Abstract: We propose a new model for peer-to-peer networking
which takes the network bottlenecks into account beyond
the access. This model allows one to cope with key features of 
P2P networking like degree or locality constraints or the fact 
that distant peers often have a smaller rate than nearby peers. 
We show that the spatial point process describing peers in their 
steady state then exhibits an interesting repulsion phenomenon. 
We analyze two asymptotic regimes of the peer-to-peer network: 
the fluid regime and the hard-core regime. We get closed-form
expressions for the mean (and in some cases the law) of the peer
latency and of the download rate obtained by a peer, and for
the spatial density of peers in the steady state of each regime,
together with an accurate approximation that holds for all regimes. The
analytical results are based on a mix of mathematical analysis and 
dimensional analysis and have important design implications. The 
first of them is the existence of a setting where the equilibrium 
mean latency is a decreasing function of the load, a phenomenon 
that we call super-scalability. 

I. Introduction 

Peer-to-peer (P2P) architectures have been widely used over 
the Internet in the last decade. The main feature of P2P is 
that it uses the available resources of participating end users. 
In the field of content distribution (file sharing, live or on- 
demand streaming), the P2P paradigm has been widely used 
to quickly deploy low-cost, scalable, decentralized architectures.
For instance, the ideas and success of BitTorrent [1]
have shown that distributed file-sharing protocols can provide 
practically unbounded scalability of performance. Although 
there are currently many other architectures that compete with 
P2P (dedicated Content Distribution Networks, Cloud-based 
solutions, . . . ), P2P is still unchallenged with respect to its 
low-cost and scalability features, and remains a major actor in 
the field of content distribution. 

The Achilles' heel of today's P2P content distribution is the
access upload bandwidth, as even high-speed Internet access
connections are often asymmetric with a relatively low uplink
capacity. Therefore, most theoretical models of P2P content
distribution presented so far have been 'traditional' in the sense
of assuming a common, relatively low access bandwidth, in
particular concerning the upload direction, which functions as
the main performance bottleneck. However, in the near future
the deployment of very high speed access (e.g. FTTH) will
challenge the justification of this assumption. This raises the
need for new P2P models that describe what happens when
the access is not necessarily the main/only bottleneck and that
allow one to better understand the fundamental limitations of
P2P.

A. Contributions 

A new model. The first contribution of the present paper
is the model presented in Section III, which features the
following two key ingredients which were lacking in previous
models of the literature on P2P dynamics: 1) a spatial
component thanks to which the topology of the peer locations
is used to determine their interactions and their pairwise
exchange throughput; 2) a networking component allowing
one to represent the capacity of the network elements as well
as the transport protocols used by the peers and to determine
the actual exchange throughput between them.

More precisely, we consider a scenario where peers 
randomly appear in some metric space, typically the 
Euclidean plane representing the physical distance, and 
download from their neighbors with a throughput that may 
depend on some distance or RTT (it can be the case for e.g. 
TCP transport). The typical P2P application we have in mind 
is a BitTorrent-like file-sharing system. However, the high 
abstraction level of our model also allows for interpretations 
beyond this framework. Using proper QoS requirements, it 
could be extended to any kind of P2P content distribution 
services (like live and on-demand streaming). The space 
could also be a representation of the peers' interests, the 
position of a peer representing its own centers of interest. In 
such a space, two close peers share common interests, and 
therefore are likely to exchange more data. 

A promising form of scalability. The rationale that is
usually brought forward to explain P2P scalability is that the
overall service capacity grows with the number of peers.
This allows the system to reach an equilibrium point no
matter how popular the service is. This equilibrium was first
analytically studied in [2], under the traditional assumption
mentioned above that the upload/download capacity is the
bottleneck determining the exchange throughput obtained by
peers. The model proposed in [2] leads to an equilibrium
point which exhibits the expected scaling property in that
the service latency can be shown to remain constant when
the system load increases. In our new model, the equilibrium
point may exhibit a stronger form of scalability than that
in [2], that we propose to call super-scalability, where the
service latency actually decreases with the system load.

Conditions for super-scalability to hold. As we shall see 
in Sections II and IV, this super-scalability phenomenon is not
difficult to understand from a pure queuing theory or graph 
theory viewpoint. Roughly speaking, super-scalability can be 
shown to hold in a queue whenever the service rate of a typical 
customer scales like the number of customers in the system 
(rather than like a constant as in [2]). Equivalently, it is not
difficult to see that it holds if the peer interaction graph is 
complete at any given time. 

However, in practice, the network cannot sustain arbitrarily
high rates. Also, interactions between peers are limited by
degree constraints and by the requirement to select peer
connections with good throughput. Section VI combines our
model together with an abstract network model to determine 
the conditions on the peering rules, on the network capacity 
and on the transport protocols for which the mathematical 
analysis makes sense and for which the super-scalability 
property can possibly survive. 

The laws of super-scalability. The paper also provides a 
full analytical quantification of the system at the equilibrium 
point: in addition to the latency formula, it also provides 
closed form expressions for e.g. the density of peers present 
in the P2P overlay or the rate obtained by each peer, as 
functions of the peering rules and the network parameters. 
These equilibrium laws, which take specific forms for each
type of transport protocol, are the main analytical contributions
of the paper. These are gathered in Section IV for the simplest
scenarios and in Sections VII and VIII for a few variants that
can be built on our model: generic rate functions, auxiliary
servers, seeding behavior of users, access bottleneck condition,
etc.

These laws have important P2P implications. In particular, 
they allow one to determine optimal tuning of the parameters 
of the P2P algorithms e.g. the optimal peering degree or the 
best parameters of the transport protocols to be used within 
this context. 

One theoretically novel feature of our model is the proof of
a repulsion phenomenon which was empirically observed in
[3]: as close peers get faster rates, they quit the system earlier,
so a node "sees" fewer peers in its immediate vicinity than one
would expect by considering the spatial entrance distribution
alone. All these results are validated through simulations in
Section V.

B. Related Work 

Our main scenario is inspired by a BitTorrent-like file-sharing
protocol. In BitTorrent [1], a file is segmented into
small chunks and each downloader (called leecher) exchanges 
chunks with its neighbors in a peer-to-peer overlay network. A 
peer may continue to distribute chunks after it has completed 
its own download (it is called a seeder then). Theoretical 
studies and modeling have already provided relatively good 
understanding of BitTorrent performance. 



Qiu and Srikant [2] analyzed the effectiveness of P2P file-sharing
with a simple dynamic system model, focusing on
the dynamics of leechers and seeders. Massoulié and Vojnović
[4] proposed an elegantly abstracted stochastic chunk-level
model of uncoordinated file-sharing. In the case of non-altruistic
peers (who do not continue as seeders), their results
indicated that if the system has a high input rate and starts with a
large and chunk-wise sufficiently balanced population, it may
perform well for a very long time without any seeder. However,
instability may be encountered in the form of the "missing
piece syndrome" identified by Mathieu and Reynier [5], where
one (and exactly one!) chunk keeps existing in very few copies
while the peer population grows unboundedly. Hajek and Zhu
[6], [7] proved that the syndrome is unavoidable if the non-altruistic
peers enter empty-handed and if the peer arrival
rate is larger than the chunk upload rate offered by persistent
seeders. On the other hand, they also proved that the system
becomes stable for any input rate if the peers have enough
altruism to stay as seeders as long as it takes to upload one
chunk. The missing piece syndrome can be avoided even in
the case of non-altruistic peers by using more sophisticated
download policies at the cost of somewhat increased download
times, see [8], [9], [10]. The above results were obtained in
a homogeneous, potentially fully connected network model.
The present paper introduces a much less trivial family of
peer interaction models, focusing on a bandwidth-centered
approach similar to the one proposed by Benbadis et al. [11].
To avoid excessive layers of complexity, we neglect chunk-level
modeling in this phase, although we realize that addressing
the rare chunk problem will modify and enrich the picture in
future research.

The natural feature of large variation of transfer speeds in 
P2P systems has been considered in a large number of papers. 
For example, part of the peers can rely on cellular network 
access that is an order of magnitude slower than fixed network 
access used by the other part. Such scenarios differ however 
substantially from our model, where the transfer speeds depend 
on pair-wise distances but not on the nodes as such. 

There are some earlier papers considering P2P systems in 
a spatial framework. As an example, Susitaival et al. [12]
assume that the peers are randomly placed on a sphere, and 
compare nearest peer selection with random peer selection in 
terms of resource usage proportional to distance. However, the 
distance has no effect on transfer speed in their model. Our 
paper seems to be the first where a peer's downloading rate is 
a function of its distances to other peers. 

II. Super-scalability Toy Example 

Consider a system in steady state where jobs arrive to get 
some service. This system will be said to be super-scalable if 
the mean job latency decreases when the arrival rate increases 
and all other system parameters remain fixed. 

In order to understand how super-scalability can arise, we
propose the following two toy examples. Consider a system
where peers arrive and want to download some file of size
$F$. Peers arrive in the system with intensity $\lambda$ and leave the
system as soon as their own download is completed.

In our first toy example, the access upload bandwidth is 
considered as the main bottleneck. If we neglect issues related 
to data/chunk availability, and if U is the typical upload 
bandwidth of a peer, then it makes sense to assume that U 
is also the typical download throughput experienced by each 
peer. In particular, in the steady state (if any), the mean latency 
W and the average number of peers N should be such that 

$$W = \frac{F}{U} \quad\text{and}\quad N = \lambda W = \frac{\lambda F}{U} \quad\text{(Little's Law)}. \tag{1}$$

Although very simple, (1) contains a core property of standard
P2P systems: the mean latency is independent of the arrival 
rate. This is the scalability property, which is one of the main 
motivations for using P2P. 

Now, imagine a second toy example based on a complete 
shift of the bottleneck paradigm. Let the main resource bot- 
tleneck be the (logical, directed) links between nodes instead 
of the nodes themselves. We should then consider the typical 
bandwidth C from one peer to another as the key limitation. 
If each peer is connected to every other one (the interaction
graph is complete at any time), then the equilibrium Equation
(1) should be replaced by

$$W = \frac{F}{NC} \quad\text{and}\quad N = \lambda W, \quad\text{which leads to}\quad N = \sqrt{\frac{\lambda F}{C}} \quad\text{and}\quad W = \sqrt{\frac{F}{\lambda C}}. \tag{2}$$

The behavior of this new system is quite different from the
previous one. Among other things, the service time is now 
inversely proportional to the square root of the arrival intensity, 
so that super-scalability holds. 

In this toy example, the central reason for super-scalability 
is rather obvious: the number of edges in a complete graph is 
of the order of the square of the number of nodes, and so is 
the overall service capacity. 
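
The two toy examples can be compared numerically. The following is a minimal sketch, assuming the equilibrium relations $W = F/U$ for the access-limited case and $W = F/(NC)$ with $N = \lambda W$ for the link-limited case; the function names and the numerical values are illustrative assumptions, not part of the model.

```python
import math


def toy_access_limited(lam, F, U):
    """Toy example 1: the upload bandwidth U is the bottleneck."""
    W = F / U        # mean latency, independent of the arrival intensity
    N = lam * W      # Little's law
    return W, N


def toy_link_limited(lam, F, C):
    """Toy example 2: complete interaction graph, each directed link carries C."""
    # Solving W = F / (N * C) together with N = lam * W gives N^2 = lam * F / C.
    N = math.sqrt(lam * F / C)
    W = N / lam
    return W, N


if __name__ == "__main__":
    F, U, C = 8e9, 1e6, 1e5   # illustrative values (bits, bits/s, bits/s)
    for lam in (0.01, 0.1, 1.0):
        w1, _ = toy_access_limited(lam, F, U)
        w2, _ = toy_link_limited(lam, F, C)
        print(f"lam={lam}: access-limited W={w1:.1f}s, link-limited W={w2:.1f}s")
```

In the second function, multiplying the arrival intensity by four halves the latency, which is the super-scalable behavior described above.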

The main question addressed in the present paper is to better 
understand the fundamental limitations of P2P systems and in 
particular to check whether super-scalability can possibly hold 
in future, network-limited, P2P systems, where the throughput 
between peers will be determined by transport protocols and 
network resource limitations rather than the upload capacity 
alone. This requires the definition of a new model allowing 
one to take both the latter and the former into account as 
well as the limitations inherent to P2P overlays like e.g. the 
constraints on the degree of the peering graph, the availability 
of data/chunks, etc. 

III. Network Limited P2P Systems 

The aim of this section is to define a model meeting all the 
above requirements. 



A. Dynamics 

Our peers live in a spatial domain D. The domain can be 
some general Euclidean or even abstract metric space. It can 
describe physical distance between peers, distances derived 
from metrics in the underlying physical network, or even 
represent some semantic space. 

For simplicity, we focus on a basic model where $D$ is the
Euclidean plane $\mathbb{R}^2$, but there is no basic difficulty in extending
this framework. We also sometimes use an arbitrarily large
torus as an approximation of $D$.

Assume that new peers arrive according to some time-space
random process. The set of the positions of peers present at
time $t$ is denoted by $\phi_t$.

Each peer $p$ has an individual service requirement $F_p > 0$.
In the basic example where the service required by every peer
consists of downloading one and the same file, $F_p$ would most
naturally be modeled as a constant $F$ describing the size of
the file.

We assume that two peers at locations $x$ and $y$ serve each
other at rate $f(\|x - y\|)$, where $f$ is a non-negative function
which we call the bit rate function of the model.$^1$ This function
describes the network transport and connectivity limitations.
We will see later how these limitations can be taken into 
account. 

In order to focus on bandwidth aspects, we do not explicitly
take into account issues related to chunk availability. Following
the approach proposed by [2], we assume that file-sharing
effectiveness can be affected by some factor $\eta < 1$ because
sometimes, a peer may not have any chunk that a neighbor
would want. In the following, we omit $\eta$ by assuming that file
sizes are always scaled by a factor $1/\eta$. We are aware, however,
that handling chunk availability through a constant $\eta$ has some
limitations, and we will point out the scenarios where chunk
availability can become a real issue.

The services received from several peers are additive, so
that the total download rate of a peer at $x$ is

$$\mu(x, \phi_t) = \sum_{y \in \phi_t \setminus \{x\}} f(\|x - y\|).$$



By symmetry, $\mu(x, \phi_t)$ is also the upload rate of a peer at
$x$. In order for the access not to be a further limitation, the
access capacity of a peer at $x$ should exceed $\mu(x, \phi_t)$. This is
our default assumption here (access as a possible bottleneck
is considered in Section VIII).
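
The additive rate rule can be sketched in a few lines. This is a minimal illustration, assuming peers are given as a list of distinct planar coordinates; `total_rate` and `example_bit_rate` are hypothetical helper names, and the example bit rate function (inversely proportional to distance, cut off beyond a range) is only one possible choice of $f$.

```python
import math


def total_rate(x, peers, f):
    """Total download rate of a peer at x: sum of f(distance) over all other peers."""
    return sum(f(math.dist(x, y)) for y in peers if y != x)


def example_bit_rate(r, C=1.0, R=3.0):
    """Hypothetical bit rate function: C / r within range R, zero beyond."""
    return C / r if 0.0 < r < R else 0.0


if __name__ == "__main__":
    peers = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
    for p in peers:
        print(p, total_rate(p, peers, example_bit_rate))
```

Because the bit rate function depends only on the distance, the contribution of $y$ to the rate of $x$ equals the contribution of $x$ to the rate of $y$, which is the symmetry used above.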

A peer $p$ born at point $x_p$ at time $t_p$ leaves the system when
its service requirement has been fulfilled, i.e. at time

$$\tau_p = \inf\left\{ t > t_p : \int_{t_p}^{t} \mu(x_p, \phi_s)\, ds \ge F_p \right\}.$$

A peer is usually called a leecher if it has not completed its
download, and a seeder if it has. Although this paper is mainly
focused on leecher-only systems, the situation where peers
continue as seeders after having completed their service will
be considered in Section VIII.

$^1$ We implicitly assume that bandwidth rates are automatically adjusted by
the system, at the network layer, in a TCP-like fashion, or at the application
layer, using a UDP-like approach.

B. Examples of Bit Rate Functions 

We will consider two basic cases throughout the paper: 

1) peers use a TCP-like congestion control mechanism;

2) peers use UDP. 

In P2P, UDP is often used in place of TCP. However, P2P-over-UDP
protocols try to be TCP-friendly [13], [14]: they
are designed to respect TCP flows and actually mimic TCP.$^2$

Consider first the case where peers use TCP Reno. On
the path between two peers, let $d$ denote the packet loss
probability and RTT denote the round trip time. Then the
square root formula [15] stipulates that the rate obtained on
this path is $\frac{\xi}{\mathrm{RTT}\sqrt{d}}$, with $\xi \approx 1.309$. Assuming the RTT to
be proportional to distance $r$ yields a transfer rate of the form

$$g(r) = \frac{C}{r}. \tag{3}$$



We can refine (3) by assuming that RTT is not simply linear in
$r$ but some affine function of it, namely $\mathrm{RTT} = ar + b$, where
$a$ accounts for propagation delays in the Internet path and $b$
accounts for the mean delay in the two access networks. Then
the transfer rate between two peers at distance $r$ becomes

$$g(r) = \frac{C}{r + q}, \quad\text{with } C = \frac{\xi}{a\sqrt{d}},\ q = \frac{b}{a}. \tag{4}$$

Another natural model is that where one accounts for an
overhead cost of $c$ bits per second. The transfer rate between
two peers at distance $r$ is then

$$g(r) = \left(\frac{C}{r} - c\right)^{+}, \quad\text{with } (\cdot)^{+} := \max(\cdot, 0). \tag{5}$$



In the case where peers use UDP, on the path between two
peers, the transfer rate is of the form

$$g(r) = C, \quad\text{where $C$ is a constant}. \tag{6}$$
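
The transfer rate functions discussed in this subsection can be written down directly. The sketch below assumes the forms $C/r$ for the linear-RTT TCP case, $C/(r+q)$ for the affine-RTT case, $(C/r - c)^+$ for the overhead case and a constant for UDP; the function names, and the helper that maps the square root formula parameters to $(C, q)$, are illustrative assumptions.

```python
import math


def g_tcp(r, C):
    """TCP with RTT proportional to distance: rate C / r."""
    return C / r


def g_tcp_affine(r, C, q):
    """TCP with affine RTT = a * r + b: rate C / (r + q)."""
    return C / (r + q)


def g_tcp_overhead(r, C, c):
    """TCP with an overhead of c bits per second: positive part of C / r - c."""
    return max(C / r - c, 0.0)


def g_udp(r, C):
    """UDP: constant rate, independent of the distance r."""
    return C


def tcp_constants(xi, a, b, d):
    """Map the square root formula xi / (RTT * sqrt(d)) with RTT = a*r + b to (C, q)."""
    return xi / (a * math.sqrt(d)), b / a


if __name__ == "__main__":
    C, q = tcp_constants(xi=1.309, a=2.0, b=4.0, d=0.01)
    for r in (1.0, 10.0, 100.0):
        print(r, g_tcp(r, C), g_tcp_affine(r, C, q), g_tcp_overhead(r, C, 0.1), g_udp(r, C))
```

Setting $q = 0$ in the affine form recovers the linear-RTT form, and the overhead form vanishes for all distances beyond $C/c$.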



C. Connectivity Limitation 

Having specified some transfer rate function g, we notice 
that a peer cannot interact with all other peers of the overlay 
network: it would result in a full mesh overlay, impossible 
to handle for large networks. Therefore, peers usually limit
their neighborhood, for instance by selecting only peers within
a certain distance and/or by limiting their total number of
neighbors. This constraint is even more meaningful in the
wireless context, as it can correspond to some transmission
range. This leads to the following choices for the bit rate
function:

• Constant Range model: take $f(r) = g(r)\, 1_{r < R}$ ($R$ is
called the range), so that

$$\mu(x, \phi) = \sum_{x_i \in \phi,\ 0 < \|x_i - x\| < R} g(\|x_i - x\|), \tag{7}$$

where $g$ is one of the functions considered above.

• Constant Number of Nearest Peers model: take the $L$
closest peers as the set of communicating neighbors. This
rule is non-symmetric and difficult to deal with exactly.
To begin with, the effective rate $f$ between two
peers at $x$ and $y$ is not a function of $\|x - y\|$ only, but
of the whole configuration $\phi_t$.

$^2$ For instance, TFRC (www.ietf.org/rfc/rfc3448.txt) recommends that UDP
flows use the square root formula to predict the transfer rate that a TCP flow
would get and use this rate for throttling their traffic. The TCP model is hence
directly applicable to such a setting.

TABLE I
Table of Notation

Name                  Description            Units
$C_{\mathrm{TCP}}$    Speed parameter        bits · s^-1 · m
$C_{\mathrm{UDP}}$    Speed parameter        bits · s^-1
$F$                   Mean file size         bits
$R$                   Peering range          m
$\lambda$             Leecher arrival rate   m^-2 · s^-1
$W$                   Mean latency           s
$\mu$                 Mean rate              bits · s^-1
$U$                   Upload bottleneck      bits · s^-1

In this paper, the main model will be that where the transfer
rate between two communicating peers is given by (3) or (6)
and where the range is constant. More general rate functions
(e.g. as defined in (4) and (5)) and an approximation of
connectivity defined by the number of peers will be analyzed
in Section VIII.

Let us stress again that the framework can be extended
to more general metric spaces and/or to more general rate
functions. For instance, in a noise limited wireless network
of the Euclidean plane, it makes sense to assume that the
rate between two peers at distance $r$ is determined by some
Signal to Noise Ratio condition and is hence proportional to
$\log(1 + r^{-\alpha})$, with $\alpha > 2$ the path loss exponent. Of course
the additive assumption on the point-to-point rates only makes
sense in rather particular cases (e.g. orthogonal channels) and
more general models should be considered within this wireless
setting. We will not pursue the general wireless setting in the
present paper. We will however consider more general rate
functions than the above TCP and UDP functions in Section
VIII, including the above additive wireless setting, which will
be referred to as the SNR model.

D. Mathematical Assumptions 

We assume that new peers arrive according to a Poisson process
with space-time intensity $\lambda$ ('Poisson rain'). The intensity $\lambda$, expressed
in $\mathrm{m}^{-2}\cdot\mathrm{s}^{-1}$, describes the birth rate of peers: the number of
peer arrivals taking place in a domain of surface $A$ (expressed
in $\mathrm{m}^2$) in an interval $[s,t]$ (in seconds) is a Poisson random
variable with parameter $\lambda A (t - s)$.

For the sake of mathematical tractability, we assume the
$F_p$'s to be independent and identically distributed random
variables with finite expectation, denoted by $F = E(F_p)$.
More specifically, we assume in this paper that their common
distribution is exponential with mean $F$.

Proposition 1: If the domain $D$ in which the peers live is
compact, then $\phi_t$ is a Markov process which is ergodic for
any birth rate $\lambda > 0$.



The proof, which can be found in Appendix X, is based
on a domination argument which can easily be extended to
unbounded domains. The existence of stationary regimes for
$\phi_t$ in the case of an unbounded domain then follows from
this and a tightness argument. However, the ergodicity of
$\phi_t$ and the uniqueness of its stationary regimes cannot be
established as easily in this case. Garcia and Kurtz [16] proved
the existence and ergodicity of a wide class of attractive spatial 
birth-and-death processes in infinite domains. Extending their 
approach to our repulsive case (see below for the terminology) 
seems feasible but goes way beyond what can be done within 
the space limitations of the present paper. In what follows, 
for results stated on the (infinite) Euclidean plane case, we 
conjecture that the spatial birth-and-death processes of interest 
admit a unique stationary regime. In any case, all our results 
can be rephrased on a large torus where this conjecture is not 
needed. 
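
The dynamics under these assumptions can be explored with a small event-driven simulation on a torus. The sketch below is illustrative only: it assumes the constant range model with rate $C/r$ within range $R$, a square torus whose side is at least $2R$, and an initially empty system; the helper names are hypothetical. It relies on the memoryless property of the exponential file sizes, so that a peer downloading at rate $\mu$ completes with hazard $\mu/F$.

```python
import math
import random


def torus_distance(p, q, side):
    """Distance between p and q on a square torus of the given side length."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    return math.hypot(min(dx, side - dx), min(dy, side - dy))


def simulate_density(lam, F, C, R, side, t_end, seed=0):
    """Time-averaged number of peers per unit area over [0, t_end]."""
    rng = random.Random(seed)
    peers = []          # positions of the peers currently present
    t = 0.0
    peer_time = 0.0     # integral over time of the number of peers
    birth_rate = lam * side * side
    while True:
        # Departure hazard of each peer: (total download rate) / F.
        hazards = []
        for i, p in enumerate(peers):
            mu = 0.0
            for j, q in enumerate(peers):
                if i != j:
                    r = torus_distance(p, q, side)
                    if 0.0 < r < R:
                        mu += C / r
            hazards.append(mu / F)
        death_rate = sum(hazards)
        total = birth_rate + death_rate
        dt = rng.expovariate(total)
        if t + dt >= t_end:
            peer_time += len(peers) * (t_end - t)
            break
        peer_time += len(peers) * dt
        t += dt
        if death_rate == 0.0 or rng.random() * total < birth_rate:
            # Birth: a new peer appears at a uniform location.
            peers.append((rng.random() * side, rng.random() * side))
        else:
            # Death: a peer leaves, chosen proportionally to its hazard.
            victim = rng.choices(range(len(peers)), weights=hazards)[0]
            peers.pop(victim)
    return peer_time / (t_end * side * side)


if __name__ == "__main__":
    print(simulate_density(lam=1.0, F=1.0, C=1.0, R=1.0, side=5.0, t_end=20.0))
```

Each step recomputes all pairwise rates, so the cost per event is quadratic in the number of peers; this keeps the sketch short but limits it to small domains.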

IV. Mathematical Analysis 

In this section, we focus on the main model under TCP
(3) with fixed range $R$. The results on UDP (6) are provided
as well. We adopt the same strategy concerning the proofs as
above for Proposition 1: we give proofs in the torus case, so
as to provide the main ideas, but refrain from discussing their 
extensions to the infinite Euclidean plane. The limiting argu- 
ments for these extensions are left for future work. The final 
formulas are however always given in the infinite Euclidean 
plane where they have a particularly simple form. 

For the main model, the system has 4 basic parameters: the
range $R$ in meters (m), the typical file size $F$ in bits, the peer
arrival rate $\lambda$ in $\mathrm{m}^{-2}\cdot\mathrm{s}^{-1}$ and a rate constant $C$ in bits · m · s^-1
(bits · s^-1 in the UDP case).

According to Proposition 1, the model admits a steady state
regime where the peers (in the basic model all leechers) form
in $\mathbb{R}^2$ a stationary and ergodic point process [17].

We denote by $\beta_o$ the density of the peer (leecher) point
process, by $\mu_o$ the mean rate of a typical peer, by $W_o$ the
mean latency of a typical peer, and by $N_o$ the mean number
of peers in a ball of radius $R$ around a typical peer, all in the
steady state regime of the P2P dynamics.

In the following, we will also consider several approximations
of the main model:

• a fluid regime/limit, where the corresponding quantities
will be denoted by an $f$ subscript (e.g. $\beta_f$);

• a hard-core regime/limit, for which we will use an $h$
subscript (e.g. $\beta_h$);

• a heuristic description of the main model, with a hat
notation (e.g. $\hat{\beta}_o$).

In any of these regimes, Little's law states that the average
density satisfies $\beta = \lambda W$.

A. Fluid Limit 

The fluid limit consists in assuming that the density is 
uniformly distributed in space at any time. In particular, in the 
fluid limit, the presence of one single peer in a given point 
does not impact the system. 



From Campbell's formula [17], the mean total bit rate at a
typical location of space (or equivalently of a newcomer peer)
is

$$\mu_f = \beta_f\, 2\pi \int_{r=0}^{R} (C/r)\, r\, dr = \beta_f\, 2\pi C R. \tag{8}$$

Now, the fluid limit hypothesis allows one to assume that a
peer sees $\mu_f$ during its whole lifetime. We get that the mean
latency of a peer is

$$W_f = \frac{F}{\mu_f}. \tag{9}$$



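Hence, by Little's law $\beta_f = \lambda W_f$,

$$\mu_f \beta_f = \lambda F. \tag{10}$$

From (8), (9) and (10), we have

$$\beta_f = \sqrt{\frac{\lambda F}{2\pi C R}}, \qquad \mu_f = \sqrt{2\pi C R \lambda F}, \qquad W_f = \sqrt{\frac{F}{2\pi C R \lambda}}. \tag{11}$$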

In the fluid limit, the mean number of peers in a ball of radius
$R$ around a typical peer is

$$N_f = \pi R^2 \beta_f = \sqrt{\frac{\pi}{2}\,\frac{\lambda F R^3}{C}}. \tag{12}$$

For UDP, the same reasoning gives:

$$\beta_{f,\mathrm{UDP}} = \sqrt{\frac{\lambda F}{\pi C R^2}}, \quad \mu_{f,\mathrm{UDP}} = \sqrt{\lambda F \pi C R^2}, \quad W_{f,\mathrm{UDP}} = \sqrt{\frac{F}{\lambda \pi C R^2}}, \quad N_{f,\mathrm{UDP}} = \sqrt{\frac{\pi \lambda F R^2}{C}}. \tag{13}$$



As we see in the expression for the mean latency in (11) and
(13), both the TCP and the UDP fluid limits exhibit the same
super-scalability as the toy example: in spite of the fact that
the interactions are not as in the complete graph and depend
on the distance, the mean latency decreases as $1/\sqrt{\lambda}$ when $\lambda$
tends to infinity and everything else is fixed.
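
The fluid limit quantities are easy to evaluate numerically. The sketch below assumes the fluid relations $\mu_f = 2\pi C R \beta_f$ for TCP and $\mu_f = \pi R^2 C \beta_f$ for UDP, combined with $\mu_f \beta_f = \lambda F$; the function names and parameter values are illustrative assumptions.

```python
import math


def fluid_tcp(lam, F, C, R):
    """Fluid limit for the TCP rate C / r with range R."""
    beta = math.sqrt(lam * F / (2.0 * math.pi * C * R))   # peer density
    mu = 2.0 * math.pi * C * R * beta                     # mean rate
    return {"beta": beta, "mu": mu, "W": F / mu, "N": math.pi * R ** 2 * beta}


def fluid_udp(lam, F, C, R):
    """Fluid limit for the constant UDP rate C with range R."""
    beta = math.sqrt(lam * F / (math.pi * C * R ** 2))
    mu = math.pi * R ** 2 * C * beta
    return {"beta": beta, "mu": mu, "W": F / mu, "N": math.pi * R ** 2 * beta}


if __name__ == "__main__":
    F, C, R = 8e9, 1e7, 1e3   # illustrative values
    for lam in (1e-6, 4e-6, 1.6e-5):
        print(lam, fluid_tcp(lam, F, C, R)["W"], fluid_udp(lam, F, C / R, R)["W"])
```

Each fourfold increase of the arrival rate halves the fluid latency in both cases.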

B. Dimensional Analysis 

At this point of the paper, the fluid limit is a thought experi- 
ment, not necessarily related to the actual model. Dimensional 
analysis [18] helps to address this issue.

We first use the $\pi$-theorem [18] to strip our problem of
redundant variables: if we choose $R$ as a new distance unit,
then the arrival intensity becomes $l = \lambda R^2$, the download
constant becomes $c = C/R$ and the other parameters are
unchanged. If we now define $F$ as an information unit, then the
download speed constant becomes $c = C/(RF)$ and the other
parameters are unchanged. Finally, if we take a time unit such
that the download speed constant is 1, we get a system where
all parameters are equal to 1 but for the arrival rate, which is
equal to $l = \lambda F R^3 / C$. As the system itself is not affected by the
choice of measurement units, all its properties only depend on
the (dimensionless) parameter

$$\rho = \rho_{\mathrm{TCP}} = \frac{\lambda F R^3}{C}. \tag{14}$$



The $\pi$-theorem allows some freedom in the choice of the
parameter. By noticing that $N_f = \sqrt{\pi\rho/2}$, we can use $N_f$,
which has a physical interpretation (the number of neighbors
predicted by the fluid limit), instead of $\rho$.
By similar arguments, we have

$$\rho_{\mathrm{UDP}} = \frac{\lambda F R^2}{C}, \tag{15}$$

so we use $N_{f,\mathrm{UDP}} = \sqrt{\pi \rho_{\mathrm{UDP}}}$.

The $\pi$-theorem states that all systems that share the same
parameter $N_f$ are similar. Now consider the union of two
independent systems that use the same parameters ($\lambda$, $F$, $C$,
$R$): the real model, with latency $W_o$, and the fluid model, with
latency $W_f$. The ratio $W_o/W_f$ is a dimensionless property of the
overall system, therefore it is a function of $N_f$ only. In other
words, there exists a dimensionless function $M(N_f)$ such that:

$$W_o = M(N_f)\, W_f. \tag{16}$$

From Little's law, we also deduce the density:

$$\beta_o = \beta_f\, M(N_f). \tag{17}$$

These equations are true for both the TCP and UDP rates 
(with a different M function in each case). 

To summarize, although our system may be subject to com- 
plex interactions and is defined by four independent parame- 
ters, dimensional analysis allows one to express its general 
behavior through a one-parameter function M (unknown), 
which expresses how far the real system is from its fluid limit. 
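
The invariance behind this argument can be checked numerically. The sketch below assumes the dimensionless parameters $\lambda F R^3 / C$ (TCP, with $C$ in bits · m · s^-1) and $\lambda F R^2 / C$ (UDP, with $C$ in bits · s^-1); `change_units` is a hypothetical helper that re-expresses the four parameters in new length, information and time units.

```python
import math


def rho_tcp(lam, F, C, R):
    """Dimensionless load parameter for the TCP rate function."""
    return lam * F * R ** 3 / C


def rho_udp(lam, F, C, R):
    """Dimensionless load parameter for the UDP rate function."""
    return lam * F * R ** 2 / C


def change_units(lam, F, C, R, length, info, time, tcp=True):
    """Re-express (lam, F, C, R) when the new units are worth `length` meters,
    `info` bits and `time` seconds."""
    lam_new = lam * length ** 2 * time          # arrivals per new area and time unit
    F_new = F / info
    R_new = R / length
    if tcp:
        C_new = C * time / (info * length)      # C in bits * m / s
    else:
        C_new = C * time / info                 # C in bits / s
    return lam_new, F_new, C_new, R_new


if __name__ == "__main__":
    params = (1e-6, 8e9, 1e7, 1e3)
    print(rho_tcp(*params))
    print(rho_tcp(*change_units(*params, length=1e3, info=8e9, time=60.0)))
```

Both printed values agree up to rounding, whatever units are chosen, which is why the ratio between the real and fluid latencies can only depend on this single parameter.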

C. Fluid as a Bound 

We now give a better understanding of the behavior of the 
real system through the following theorem. 
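Theorem 1 (Repulsion): In the steady state,

$$E\left[\sum_{x_i \in \phi} f(\|x_i\|)\right] > E_0\left[\sum_{x_i \in \phi \setminus \{0\}} f(\|x_i\|)\right], \tag{18}$$

where $E_0$ denotes expectation w.r.t. $P_0$, the Palm probability
[17] w.r.t. the point process $\phi$.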

The proof is given in Appendix XI in the torus case. Theorem
1 says that there are fewer points (in terms of their $f$-weight)
in a ball of radius $R$ around a typical peer (i.e. under the Palm
probability) than in a ball of the same radius around a typical
location of the Euclidean plane (i.e. under the stationary
probability $P$). This is what we call a repulsion effect.

Corollary 1: $M > 1$.
Proof of corollary: Theorem 1 is equivalent to saying that
$\mu_o < \beta_o 2\pi C R$. This, the relation $W_o > F/\mu_o$ (which is
obtained by a direct convexity argument) and Little's law
$\beta_o = \lambda W_o$ imply that $\beta_o^2 > \frac{\lambda F}{2\pi C R}$, which in turn implies
$\beta_o > \beta_f$ and $M > 1$.

In other words, repulsion implies that the fluid regime is 
actually a lower (resp. upper) bound for the mean latency and 
the peer density (resp. the mean rate). Now, the following 
theorem tells that the bound is tight. 

Theorem 2: When $N_f$ tends to infinity, $M$ tends to 1, and
the law of a typical peer latency converges weakly to an
exponential random variable of mean $W_f$.



A sketch of the proof is given in Appendix XII, where it is
shown that this regime is such that not only the traffic is high 
but the peers also stay long enough to make the fluctuations 
slow and weak. By the almost constancy of the rate at any 
point, we get the almost exponentiality. 

Theorem 2 says that when the number of neighbors
dicted by the fluid limit tends towards infinity, the system 
behaves like its fluid limit. 

D. Hard-Core Regime 

A stationary point process is hard-core with exclusion 
radius R if there is no pair of points in the point process 
with a distance less than R. 

Conjecture 1: When $N_f$ tends to 0, $N_f M(N_f)$ tends to
1, and the stationary peer point process tends to a hard-core
point process with exclusion radius $R$, with intensity $\beta_h$ and
latency $W_h$ defined as follows:

$$\beta_h = \frac{1}{\pi R^2}, \qquad W_h = \frac{1}{\lambda \pi R^2}. \tag{19}$$

Moreover, the cdf of the latency converges weakly to

    1 [...],  t > 0.    (20)



Conjecture 1 is supported by simulations (cf. Section V), and
by the following insight on the hard-core behavior: when two
peers are at distance $r < R$, the average time under TCP for
one of them to disappear is less than $\frac{Fr}{2C} < \frac{FR}{2C}$. If $N_f \ll 1$,
(11), (12) and Corollary 1 give $\frac{FR}{2C} = N_f W_f \ll W_o$. In other words,
when two peers are in range, one of them disappears almost
instantly compared to the typical latency of the system, so
when we take a snapshot of the system at a given time and
finite area of space, it is likely that we see only peers out of
range $R$ from each other.

A similar reasoning stands for UDP. 
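
The orders of magnitude in this argument can be checked numerically. The sketch below is a rough illustration under stated assumptions: the hard-core intensity $1/(\pi R^2)$ with latency obtained from Little's law, the TCP fluid expressions for $W_f$ and $N_f$, and a pair clearing time bound of $FR/(2C)$, which corresponds to two peers with exponential file sizes of mean $F$ serving each other at rate at least $C/R$. All function names are hypothetical.

```python
import math


def hard_core_limit(lam, R):
    """Intensity and mean latency of the conjectured hard-core limit."""
    beta_h = 1.0 / (math.pi * R ** 2)
    return beta_h, beta_h / lam           # Little's law: W_h = beta_h / lam


def fluid_tcp_latency_and_neighbors(lam, F, C, R):
    """Fluid mean latency W_f and fluid number of neighbors N_f for TCP."""
    W_f = math.sqrt(F / (2.0 * math.pi * C * R * lam))
    N_f = math.sqrt(math.pi * lam * F * R ** 3 / (2.0 * C))
    return W_f, N_f


def pair_clearing_bound(F, C, R):
    """Assumed upper bound on the mean time before one of two in-range peers leaves."""
    return F * R / (2.0 * C)


if __name__ == "__main__":
    lam, F, C, R = 1e-10, 8e9, 1e9, 1e2   # illustrative low-load values
    W_f, N_f = fluid_tcp_latency_and_neighbors(lam, F, C, R)
    beta_h, W_h = hard_core_limit(lam, R)
    print("N_f =", N_f)
    print("pair clearing bound =", pair_clearing_bound(F, C, R), "s")
    print("hard-core latency   =", W_h, "s")
    print("volume fraction     =", beta_h * math.pi * (R / 2.0) ** 2)
```

With these expressions the clearing bound equals $N_f W_f$ and the hard-core latency equals $W_f / N_f$, so the two time scales differ by a factor $N_f^2$, which is small when $N_f$ is small.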



It is worthwhile mentioning that according to (19), the
volume fraction of the associated sphere packing model is
1/4 (since we have a density $1/(\pi R^2)$ of non-intersecting balls
of radius $R/2$). This volume fraction is hence the same as
that of the Matérn hard-ball model in the so-called jamming
regime (see e.g. [19]).

Let us stress that this hard-core regime is hardly desirable (performance largely below the one predicted by the fluid limit and extreme unfairness). Moreover, peer data exchanges are very sparse, so the fluid assumption on the exchange of chunks fails to hold. Chunk availability probably becomes a bottleneck as important as bandwidth under these conditions, suggesting that the performance would be even worse if chunk exchanges were taken explicitly into account.

For all these reasons, the hard-core regime, which we presented to complete the description of our model, should be avoided at all costs. The discussions on design will hence focus in part on tuning the system parameters so as to avoid this regime.



E. Heuristic 



For intermediate values of Nf, where fluid and hard-core 
limits do not apply, we propose a first order approximation. 

For TCP, it consists in approximating $M$ by $\tilde M$, the unique solution in $[1, \infty)$ of



M 1 
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(21) 



In order to derive (21), we use a heuristic factorization of the factorial moment measure of order 3 [11], which is described in Appendix XIII. Informally, the method consists in computing
an approximation u of the average rate of a peer assuming 
that: (i) a neighbor at distance r from that peer "sees" a rate 
u a + — ; (ii) in return, the peer "sees" at distance r a density 
of neighbors ^ x ^o (using (fTOjt). 

Under this approximation, the fluid equation ([8]) now be- 
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2Nf, equation 



Using /t 
(|2T} follows. 

This heuristic is in line with Theorem 2 and Conjecture 1. When $N_f$ tends to $\infty$, it follows from (21) that $\tilde M \sim 1$. This is in line with Theorem 2. When $N_f$ tends to 0, expanding the log in (21) gives $\tilde M \sim \frac{1}{N_f}$, which substantiates Conjecture 1.



In the UDP case, the same heuristic leads to 
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2N f J 2N f ' 
which also supports both Theorem 2 and Conjecture 1.

F. Toy Example Revisited 

We revisit the example of Section II within the more precise framework considered in the present section (Poisson arrivals, exponential file size). This toy example can be seen as the UDP case on the torus, when the range is large enough for all pairs of peers to be within range. Assume the surface of the torus to be 1. Then, geometry disappears and we have a birth and death process for the total population with birth rate $\lambda$ and death rate in state $i$ equal to $\delta(i) = \mu i(i-1)$. The state space is that of positive integers. The local balance equations read

$$\pi(i-1)\lambda = \pi(i)\mu i(i-1), \quad i \ge 2.$$



The solution is

$$\pi(i) = \pi(1) \frac{\rho^{i-1}}{i!\,(i-1)!},$$

where $\rho = \lambda/\mu$. Hence the mean is
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with B the Bessell function. 
In words, we have 



P = /3fM(N f )wiih 



Pf = N f 
M(X) = 
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and 
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We recognize the approximation Q, which implies super- 
scalability in corrected with an M function like in \Y1) . 
We remark that for this toy example, we have the exact value of $M$ and not an asymptote or a heuristic, and we can verify that Corollary 1 and Theorem 2 hold.

Notice also that if $\delta(i) = \mu i$, then $\pi(i) = \frac{\rho^i}{i!} e^{-\rho}$ for all $i \ge 0$, so that $N = \rho$ and $W = 1/\mu$. This case is in essence that of [2], or equivalently that of the M/M/$\infty$ queue, where we have scalability but no super-scalability.
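The birth and death chain of the toy example is easy to evaluate numerically. The sketch below assumes the death rate $\delta(i) = \mu i(i-1)$ implied by the local balance equations, truncates the state space at a hypothetical `n_max`, and applies Little's law $W = N/\lambda$. The function names are illustrative and not taken from the paper.

```python
def stationary_distribution(rho, n_max=400):
    """pi(1..n_max) from pi(i-1) * lam = pi(i) * mu * i * (i-1), with rho = lam / mu."""
    weights = [1.0]  # unnormalised pi(1)
    for i in range(2, n_max + 1):
        weights.append(weights[-1] * rho / (i * (i - 1)))
    total = sum(weights)
    return [w / total for w in weights]


def mean_population(rho, n_max=400):
    """Mean number of peers N in the stationary regime."""
    pi = stationary_distribution(rho, n_max)
    return sum(i * p for i, p in enumerate(pi, start=1))


def mean_latency(lam, mu, n_max=400):
    """Little's law: W = N / lam."""
    return mean_population(lam / mu, n_max) / lam


if __name__ == "__main__":
    for lam in (1.0, 10.0, 100.0):
        rho = lam  # mu = 1
        n = mean_population(rho)
        # n / sqrt(rho) plays the role of M: above 1, approaching 1 as the load grows
        print(lam, n, n / rho ** 0.5, mean_latency(lam, 1.0))
```

With $\mu$ fixed, the printed latency decreases as $\lambda$ grows, which is the super-scalability of the toy example, while the ratio $N/\sqrt{\rho}$ stays above 1 and gets closer to 1.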

V. Simulation Results 

In this section, we validate and substantiate our results by means of simulations. For the sake of computability, we approximate the infinite space by a torus of radius 1. For concision, we only present here the simulation results for the TCP case, but UDP results are completely similar.

As stated by the dimensional analysis, all systems can be 
described by the function M. The goal of the simulation is then 
to sample that function. We just have to fix three independent 
parameters and use the fourth one to run through all possible 
scenarios. 

We fix the following parameters: $R = 0.1$, which gives a good trade-off for the torus as an approximation of the plane; $C = 1$ (arbitrary choice); $W_f = 100$. The last choice means that the remaining free parameters are adjusted so that (11) evaluates to 100 in each experiment. That way, for all simulations, the fluid model will predict the same mean latency, so the measured latencies will give $M$ directly, up to a constant factor $W_f$.

We naturally use $N_f$ (defined by (12)) as the variable parameter. We use $N_f$ instead of $\rho$ as main dimensionless parameter because it is strictly equivalent from the point of view of dimensional analysis, yet it gives a direct meaning to the variable (average number of neighbors in the fluid model). The remaining input parameters of the system are then completely defined:
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We choose to use a discrete time simulator, with elementary 
time step set to $t = 1$. With our settings, the resulting step transitions are empirically small enough for the discrete model to be a good approximation of the continuous model. In the end, we get a simulator that achieves the needed trade-off between speed and accuracy (an event-based simulator, for instance, would give an exact rendering of the continuous model but would require a lot more computation). For each considered setting, the simulation runtime was adjusted so that about 20000 peers could be observed in the stationary state. All results presented are obtained through 10 runs per setting.

Fig. 1. $M(N_f)$ in the TCP case.
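The way the remaining inputs follow from $N_f$ can be sketched as below. This is an illustration under the fluid relations used throughout the paper for TCP, namely $N_f = \pi R^2 \beta_f$, $\beta_f = \lambda W_f$, $\mu_f = 2\pi C R \beta_f$ and $W_f = F/\mu_f$; the function name and its return layout are assumptions, not the authors' simulator code.

```python
import math


def fluid_parameters(n_f, R=0.1, C=1.0, w_f=100.0):
    """Arrival rate lam and mean file size F giving the requested N_f and W_f (TCP case)."""
    beta_f = n_f / (math.pi * R ** 2)       # from N_f = pi R^2 beta_f
    lam = beta_f / w_f                      # Little's law, beta_f = lam * W_f
    mu_f = 2.0 * math.pi * C * R * beta_f   # fluid rate of a peer
    F = w_f * mu_f                          # from W_f = F / mu_f
    return lam, F, beta_f, mu_f


if __name__ == "__main__":
    for n_f in (1.0 / 32.0, 1.0, 64.0):
        lam, F, beta_f, mu_f = fluid_parameters(n_f)
        # Consistency check: W_f recomputed as sqrt(F / (2 pi C R lam))
        w_check = math.sqrt(F / (2.0 * math.pi * 1.0 * 0.1 * lam))
        print(n_f, lam, F, w_check)
```

For every value of $N_f$ the recomputed fluid latency equals 100, so only $\lambda$ and $F$ change from one experiment to the next.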

A. Properties of the M Function 

We propose to start with a global study of the function $M$. We ran simulations for $N_f$ varying from 1/32 to 64. Results are displayed in Figure 1.

The empirical results are compared with 1) the fluid limit, 1; 2) the hard-core limit, $1/N_f$; and 3) the heuristic formula (21).

Figure 1 allows us to check almost all results from the previous section at a glance:

• the fluid limit is a lower bound of the actual system (which is equivalent to Theorem 1);

• as $N_f$ goes to $\infty$, the fluid bound becomes tight (this is Theorem 2);

• as $N_f$ goes to 0, the system behavior converges towards the hard-core limit (this is Conjecture 1).

Additionally, one checks that the heuristic (21) gives a good approximation of $M$ for intermediate values of $N_f$, while converging to the hard-core and fluid limits when $N_f$ goes to 0 and $\infty$ respectively.

B. Fluid Model 

We now propose to focus on the case $N_f = 64$, in order to analyze the system in detail when it reaches the fluid limit. The value $M(64)$ given by simulations is 1.007, which is higher than 1 yet very close to it, as predicted by Theorem 2.

The latency distribution is almost indistinguishable from an exponential distribution of mean $W_f$ (Figure 2), as predicted by Theorem 2.

Fig. 2. Cdf of latency for $N_f = 64$.

In the fluid model, it is quite difficult to distinguish the system from a spatial birth and death process of birth parameter $\lambda$ and death parameter $1/W_f$, namely a Poisson point process of intensity $\beta_f$. Differences can only be spotted if small distances
are involved. More precisely, two peers at distance r have a 
mutual latency influence of so one can expect Palm effect 
to become less visible when ^ is large enough compared to 
Wf. This allows us to show that -3- is the critical distance 
below which the Palm effects become difficult to neglect. For 
N f = 64, this gives J| w 0.016. 

In our case, the best way to differentiate the actual process from a Poisson process is to consider how far the closest neighbor of a peer is. While for a Poisson process the distance should be $\frac{1}{2\sqrt{\lambda W_f}} \approx 0.0111$ on average, simulation shows an actual average distance of 0.0115: the nearest neighbor is slightly farther away, by about 4%. If we compare the two distributions in detail, the main gap appears for small distances (cf. Figure 3), which supports the concept of critical Palm distance: if a peer gets a very close neighbor, both rates will be higher than usual, so one of them is likely to leave sooner, lowering the probability of finding very close neighbors in a random configuration. As $N_f$ tends to $\infty$, we expect this difference to become negligible: the probability of getting a neighbor so near that it significantly affects the total rate becomes arbitrarily low, so the repulsion effect becomes negligible.
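The Poisson reference value quoted above can be reproduced in a few lines. The sketch relies on the classical fact that the mean nearest-neighbor distance of a planar Poisson process of intensity $\beta$ is $1/(2\sqrt{\beta})$, and on the simulation settings $R = 0.1$, $W_f = 100$, $N_f = 64$; the variable names are illustrative.

```python
import math


def poisson_mean_nn_distance(beta):
    """Mean distance to the nearest neighbor in a planar Poisson process of intensity beta."""
    return 1.0 / (2.0 * math.sqrt(beta))


R, W_F, N_F = 0.1, 100.0, 64.0
beta_f = N_F / (math.pi * R ** 2)    # N_f = pi R^2 beta_f
lam = beta_f / W_F                   # Little's law, beta_f = lam * W_f
d_poisson = poisson_mean_nn_distance(lam * W_F)
gap = 0.0115 / d_poisson - 1.0       # 0.0115 is the simulated value quoted in the text
print(d_poisson, gap)
```

The relative gap comes out slightly below 4%, in line with the figure given in the text.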

C. Hard-Core Model 

We conduct the same type of detailed study for $N_f = 1/32$. For these parameters, the value $M(1/32)$ is now 31.6, to be compared with the hard-core model prediction $M_h = 32$; so the accuracy of the model is pretty good.

Figure 4 displays the latency distribution, using for comparison the hard-core distribution and the exponential distribution of parameter $W_h$. One observes a close fit to the one proposed by the distribution function (20) of Conjecture 1: when a peer arrives, with probability one half, it disappears instantly; otherwise its latency follows an exponential distribution of mean $2W_h$. In other words, not only is the mean latency much larger than in the fluid model (by a ratio $\frac{W_h}{W_f}$), but half of the peers will get a service time arbitrarily larger than the other half (as $N_f$ goes to 0).

Fig. 3. Cdf of nearest neighbor distance for $N_f = 64$.

Fig. 4. Cdf of latency for $N_f = 1/32$.

The distribution of the closest neighbor is also of interest (cf. Figure 5); the distribution has been truncated to the maximal distance $R$, as a peer does not "see" beyond $R$.

We see here the repulsion effect at its paroxysm: there are many orders of magnitude between the empirical distribution and the equivalent Poisson distribution. For instance, Poisson says that the probability to have at least one neighbor in range is $1 - e^{-\lambda \pi W R^2} \approx 62.6\%$. In the stationary regime, this probability is only 0.078%, whereas the hard-core conjecture tells us that it will continue to decrease as $N_f$ goes to 0.
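The Poisson reference probability can be checked from the two numbers reported for this experiment. The sketch assumes that the equivalent Poisson process has the measured intensity $\beta = \lambda W = M \beta_f$, so that $\lambda \pi W R^2 = N_f M$; the function name is illustrative.

```python
import math


def poisson_neighbor_probability(n_f, m):
    """Probability of at least one neighbor within range R for a Poisson process
    of intensity beta = m * beta_f, i.e. with pi R^2 beta = n_f * m."""
    return 1.0 - math.exp(-n_f * m)


p = poisson_neighbor_probability(1.0 / 32.0, 31.6)
print(p)
```

With $N_f = 1/32$ and the simulated $M = 31.6$, this gives a probability of roughly 63%, about three orders of magnitude above the 0.078% observed in the stationary regime.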

D. Intermediate Values 

We have no good formal description of the actual laws observed for intermediate values of $N_f$.

In order to compare with the fluid and hard-core limits, we give the latency distribution (Figure 6) and the closest neighbor distribution (Figure 7) for $N_f = 1$. One can see that these distributions show a compromise between the equivalent fluid and hard-core distributions.

E. Summary of Simulations 

For both the fluid and hard-core limits, simulations validate that we have a good description of the average system performance defined by $M$, but also of the latency distribution. For intermediate states, although the bounds still hold, it is better to rely on the heuristic, which gives quite accurate results on $M$, but with no details on the distribution.

Fig. 5. Cdf of nearest neighbor for $N_f = 1/32$.



VI. Network Capacity Constraints 

The aim of this section is to determine the capacity required 
for the network elements in order to achieve the super-scalable 
regime identified above. 

More precisely, so far, the only assumptions on the network were that 1) the access is not a limitation anymore (or not the only bottleneck); 2) the network is a bottleneck, resulting in a rate between peers that depends on their distance and some range or degree constraints.

This section introduces an abstract network model on which 
the P2P traffic will be mapped through some natural shortest 
path routing mechanism. We then determine the mean flow 
that traverses a typical network element. This flow of course 
depends on the protocols used in the network which in turn 
determine the bit rate function. 

For simplicity, we limit the study to the fluid limit of the system.

Fig. 6. Cdf of latency for $N_f = 1$.

Fig. 7. Cdf of closest neighbor for $N_f = 1$.

A. Network Capacity Model 

We consider an underlying network made of routers and links between them. A simple example is that where

(i) routers form a Poisson point process of intensity $\theta$ in the plane;

(ii) links are the Delaunay edges (see e.g. [19], Chapt. 4) on this point process;

(iii) each peer is directly connected to the closest router and the path between two routers is the shortest path (with minimal hop count) on the Delaunay graph.

In this case, the number of links between two peers is asymptotically proportional to the distance between them [19].

For all straight lines of the plane, the point process of intersections of this line with the edges forms a stationary point process of intensity $\mu_e := 2\sqrt{\theta}$ on the line. Denoting by $K$ the capacity of an edge, we get a total capacity per unit distance of $S := \mu_e K$.

Now, in order to simplify the evaluation of the P2P load on the underlying network, we will assume that (a) $\theta$ is large enough so that the hop-count between two peers can be seen as proportional to their distance and the flow between them as a straight line; (b) any rate smaller than $S\ell$ can be transported through a segment of length $\ell$.

Remark: In order to further justify the formula $f(r) = \frac{C}{r} 1_{r<R}$ for the rate of two peers at distance $r$ within the refined network model presented above, one can use the bandwidth sharing formalism of [20]. A connection of Euclidean length $r$ uses approximately $\gamma r / \ell_e$ links, where $\ell_e$ is the mean length of a Delaunay edge of a Poisson point process of intensity $\theta$ (see [21], p. 477) and where $\gamma$ is the (stretch) constant of the shortest path algorithm (see [19], Vol. 2, Prop. 20.7). We assume that each link is of capacity $K$. We consider the network as an open bandwidth sharing network [20] with connections of various classes arriving to the network, transferring a file of mean size $F$ and leaving the network. We write the bandwidth optimization problem in any given state in this network as

$$\max \sum_i \log(\nu_i)$$

under the constraints

$$\sum_{i \in C_j} \nu_i \le K \quad \text{for all } j,$$

where $\nu_i$ is the rate of connection $i$ and $C_j$ is the collection of connections that traverse link $j$ in this state. Denoting by $\alpha_j$ the Lagrange multiplier associated with constraint $j$, we get that at the optimum point, for all $i$,

$$\nu_i = \frac{1}{\sum_{j : i \in C_j} \alpha_j}.$$

In the steady state regime (in both time and space), the sequence $\alpha_j$ should be stationary and ergodic. So, denoting by $\alpha$ its mean, when $\theta$ is large, $\mathrm{card}\{j : i \in C_j\}$ is large too and we get from spatial ergodicity that if connection $i$ is of length $r$, namely uses $\gamma r / \ell_e$ links, then $\nu_i \approx \frac{1}{\alpha\, l(\nu_i)}$, with $l(\nu_i) = \mathrm{card}\{j : i \in C_j\} \approx \gamma r / \ell_e$. Hence $\nu_i \approx \frac{\ell_e}{\alpha \gamma r}$, which is of the form $\frac{C}{r}$, for $r < R$.

B. Flow Equations 

For ease of exposition, we start with the model on the line. The flow through the origin is

$$2 \sum_{i, j} f(x_j - x_i)\, 1_{x_i < 0}\, 1_{x_j > 0}.$$

In the fluid model, we can use the fact that the second moment measure of $\Phi$ is $\beta^2$ times the Lebesgue measure on $\mathbb{R}^2$ and Campbell's formula to get that

$$\psi = 2 \int_{x<0} \int_{y>0} f(y - x)\, \beta\, dx\, \beta\, dy = 2\beta^2 \int_{r>0} r f(r)\, dr.$$

The last expression comes from the change of variables $r := y - x$, $x := x$. Consider now the model on the plane. Let $X_i = (x_i^1, x_i^2)$.

We make here the assumption that the bit flow between any two peers follows a straight line in the plane, and that the network capacity is defined by some constant $S$, expressed in bits.s$^{-1}$.m$^{-1}$, such that the maximal flow rate that can go through a segment of length $\ell$ is $S\ell$.

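Let $\Psi(\epsilon)$ be the rate that goes through a segment $S$ of length $\epsilon$. We can choose for instance $S = [(0, -\frac{\epsilon}{2}), (0, \frac{\epsilon}{2})]$. Let $H^-$ denote the left half-plane and $H^+$ the right half-plane. Then $\Psi(\epsilon)$ is

$$\begin{aligned}
\Psi(\epsilon) &= 2\, E \sum_{X_i \in \Phi \cap H^-,\; X_j \in \Phi \cap H^+} f(|X_i - X_j|)\, 1_{[X_i, X_j] \cap S \neq \emptyset} \\
&= 2 \iint_{X \in H^-,\; Y \in H^+} f(|X - Y|)\, 1_{[X, Y] \cap S \neq \emptyset}\, \beta\, dX\, \beta\, dY \\
&= 2\beta^2 \iint_{X \in H^-,\; Z \in H^+} f(|Z|)\, 1_{[X, Z + X] \cap S \neq \emptyset}\, dX\, dZ \\
&= 4\beta^2 \int_{r>0} \int_{\theta=0}^{\pi/2} f(r)\, r \sin(\theta)\, \epsilon\, r\, dr\, d\theta \\
&= 4\beta^2 \epsilon \int_{r>0} r^2 f(r)\, dr,
\end{aligned}$$

where the third line comes from the change of variables $Z := Y - X$, $X := X$. So, by isotropy, the flow per unit length through any line of the plane is

$$\Psi = \Psi(1) = 4\beta^2 \int_{r>0} r^2 f(r)\, dr.$$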

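Using the fluid expression of the density

$$\beta = \beta_f = \sqrt{\frac{\lambda F}{2\pi \int_{r>0} r f(r)\, dr}},$$

we get the following key relation

$$\Psi = \Psi(1) = \frac{2}{\pi}\, \lambda F\, \frac{\int_{r>0} r^2 f(r)\, dr}{\int_{r>0} r f(r)\, dr}. \tag{28}$$

In the TCP case ($f(r) = \frac{C}{r} 1_{r<R}$), we get

$$\Psi_{TCP} = 2 C \beta_f^2 R^2 = \frac{1}{\pi} \lambda F R. \tag{29}$$

In the UDP case ($f(r) = C 1_{r<R}$), we get

$$\Psi_{UDP} = \frac{4}{3} C \beta_f^2 R^3 = \frac{4}{3\pi} \lambda F R. \tag{30}$$

For the network to sustain the rate generated by our model, it is required that

$$\Psi \le S. \tag{31}$$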

If one can assume, under some joint fluid limit, that both the flow and the number of links going through a segment are asymptotically deterministic, then Condition (31) is also sufficient for stability. Studying the validity conditions of this hypothesis is, however, beyond the scope of this paper.
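The key relation between the flow per unit length and the rate function can be checked numerically. The sketch below assumes that (28) reads $\Psi = \frac{2}{\pi} \lambda F \int r^2 f(r)\,dr / \int r f(r)\,dr$ with both integrals taken over $(0, R)$, and compares it with the closed forms $\lambda F R/\pi$ (TCP-like $f(r) = C/r$) and $4\lambda F R/(3\pi)$ (UDP-like $f(r) = C$). The midpoint integrator and all names are illustrative choices.

```python
import math


def integrate(func, a, b, n=2000):
    """Midpoint rule; never evaluates func at the endpoints (useful for C / r at 0)."""
    h = (b - a) / n
    return sum(func(a + (k + 0.5) * h) for k in range(n)) * h


def flow_per_unit_length(f, R, lam, F):
    """Psi = (2 / pi) * lam * F * int_0^R r^2 f(r) dr / int_0^R r f(r) dr."""
    num = integrate(lambda r: r * r * f(r), 0.0, R)
    den = integrate(lambda r: r * f(r), 0.0, R)
    return 2.0 / math.pi * lam * F * num / den


if __name__ == "__main__":
    R, lam, F = 0.1, 2.0, 3.0
    for C in (1.0, 50.0):
        tcp = flow_per_unit_length(lambda r: C / r, R, lam, F)
        udp = flow_per_unit_length(lambda r: C, R, lam, F)
        print(C, tcp, lam * F * R / math.pi, udp, 4.0 * lam * F * R / (3.0 * math.pi))
```

Changing `C` leaves both flows unchanged, which is the independence of the network load from $C$ in the fluid limit.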



Note that for both TCP and UDP, the condition (31) does not depend on $C$. This surprising result means that in the fluid limit, we can arbitrarily scale the individual rate of connections (thus decreasing the latency) without changing the burden on the underlying network. Of course, there is a flaw in that reasoning, which is that increasing $C$ eventually impairs the validity of the fluid limit. In detail, as $C$ increases, $N_f$ gets smaller, so we tend to the hard-core limit where (i) there is unfairness, as half of the peers get almost instant service compared to the other half; (ii) the average latency reaches the asymptotic value $\frac{1}{\lambda \pi R^2}$, so further increase of $C$ is meaningless.

VII. More General Rate Functions 

While we focused on TCP-like ($\frac{C}{r} 1_{r<R}$) and UDP-like ($C 1_{r<R}$) functions, all our results can easily be generalized in the fluid limit to any rate function $f$ such that $\int_{r>0} r f(r)\, dr < \infty$. Even if $f$ has no maximal range $R$, we just have to replace $CR$ in (8) by $\int_{r>0} r f(r)\, dr$ and proceed. This gives

$$\mu_f = \beta_f \gamma, \quad \text{with } \gamma = 2\pi \int_{r>0} r f(r)\, dr. \tag{32}$$

Once $\gamma$ is known, we can generalize (11) by

$$\mu_f = \sqrt{\lambda F \gamma}, \qquad W_f = \sqrt{\frac{F}{\lambda \gamma}}. \tag{33}$$

Notice that the scaling in $\frac{1}{\sqrt{\lambda}}$ still holds.

Without a range $R$, $N_f$, which is $\pi R^2 \beta_f$, is not properly defined, which impairs a direct introduction of $M$. However, if we have $\int_{r>0} r^2 f(r)\, dr < \infty$, we can use
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(34) 



instead of $R$ and extend the dimensional analysis accordingly ($R$ being interpreted as the typical range of $f$).

Let us illustrate this method with a few concrete examples of type $f(r) = g(r) 1_{r<R}$.

A. Affine RTT 

If $g$ is given by Q, then the mean bit rate of a typical location of space is

cR 



C 



/r=0 r ~ 
so that we have 
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r=0 r + 1 

B. Overhead 

For g as in 0, after noticing the necessary condition R < & 
(each connection needs to use a minimal bandwidth c for the 
overhead), we get 



H f = P f 2ir 



r=0 
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r dr 



so that 
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The best value for R is R 



= 2n \^RC 
— , which gives 7 
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C. Per Flow Rate Limitation 

The protocol or some physical constraints may limit the individual rates. If one assumes a maximal rate $U$ for each flow, we have $g(x) = (C/x) \wedge U$. This gives

$$\gamma = 2\pi \int_0^R \left(\frac{C}{r} \wedge U\right) r\, dr = \begin{cases} \pi U R^2 & \text{if } C \ge UR, \\ \pi C \left(2R - \frac{C}{U}\right) & \text{otherwise.} \end{cases} \tag{37}$$

We recover (11) and (13) as special cases of (33) for $U = \infty$ and $U \le \frac{C}{R}$ (up to notation for the latter).
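The rate-limited case lends itself to a direct numerical check. The sketch below computes $\gamma = 2\pi \int_0^R ((C/r) \wedge U)\, r\, dr$ both by a midpoint rule and through the closed form obtained by splitting the integral at $r = C/U$, namely $\pi U R^2$ when $C \ge UR$ and $\pi C (2R - C/U)$ otherwise. The closed form is derived here from the definition and the function names are illustrative.

```python
import math


def gamma_closed_form(C, R, U):
    """gamma for g(r) = min(C / r, U) on (0, R), by splitting the integral at r = C / U."""
    if C >= U * R:
        return math.pi * U * R ** 2
    return math.pi * C * (2.0 * R - C / U)


def gamma_numeric(C, R, U, n=10000):
    """Midpoint approximation of 2 pi * int_0^R min(C / r, U) * r dr."""
    h = R / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        total += min(C / r, U) * r
    return 2.0 * math.pi * total * h


def fluid_latency(lam, F, gamma):
    """W_f = sqrt(F / (lam * gamma)), the generalized fluid latency."""
    return math.sqrt(F / (lam * gamma))


if __name__ == "__main__":
    C, R = 1.0, 0.1
    for U in (5.0, 50.0, math.inf):
        g = gamma_closed_form(C, R, U)
        print(U, g, fluid_latency(2.0, 3.0, g))
```

With `U = math.inf` the closed form falls back to $2\pi C R$, the TCP value, and lowering $U$ can only decrease $\gamma$ and hence increase the fluid latency.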



D. SNR Wireless Model 

The setting is that where the bit rate function is 



f(r) = g log 
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« a 



r<R 



(38) 



with $\alpha > 2$ the path loss exponent, $C$ the Signal to Noise Ratio at distance 1 and $R$ the transmission range.

In the case when $R$ is finite, we will limit ourselves to the fluid case and to the special case where $\alpha = 4$ (the reason for the last assumption being that the relevant integral, namely $\int_0^R \log\left(1 + \frac{C}{r^4}\right) r\, dr$, can then be explicitly computed). In this case, direct computations give that
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The evaluation of the mean number of neighbors of a typical 
node, namely $N_f = \pi R^2 \beta_f$, allows one to identify the mean number of orthogonal channels per unit space required to cope with the P2P load, namely



PfNf = ttR 
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(40) 



In an infinite plane, this would require an infinite number of orthogonal channels, which is of course not feasible. However, it then makes sense to reuse spectrum, at the cost of a decrease of $C$ (resulting from an increase of the noise power due to the presence of distant interference).

In this sense, this scheme makes sense under appropriate spectrum bandwidth assumptions, in the same way as the TCP scheme makes sense under appropriate network capacity assumptions.

Notice that the integral $\int_0^\infty \log\left(1 + \frac{C}{r^\alpha}\right) r\, dr$ is finite. This allows us to consider the wireless SNR model with an infinite range. In this case, the result is much simpler: for all $\alpha > 2$,
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VIII. Extensions of the Basic Model 

The aim of this section is to show that our analysis can 
be extended in several ways and take important practical phe- 
nomena into account. Unless otherwise stated, we will place 
ourselves in the fluid regime, but the dimensional analysis 
approach can be used with all extensions to relate the fluid 
limit to the real system through some function M. The only 
caveat is that if an extension introduces new parameters, M 
can be a function of several dimensionless variables instead 
of Nf only. This is illustrated by our first extension. 

A. Permanent Servers 

Assume that there exist some servers, or eternal seeders³. The motivation for considering this is for instance: (i) permanent servers can solve the issue of chunk availability by being able to provide any requested chunk; (ii) this allows one to consider hybrid systems which combine classical server solutions and a P2P approach; (iii) with our model, the latency goes to $\infty$ when $\lambda$ goes to 0 (cf. (19)), which is not a desirable effect; servers or permanent seeders seem a perfect solution to prevent this.

We focus on the TCP case. 

The servers are characterized by their density of bitrate $U_c$, expressed in bit.s$^{-1}$.m$^{-2}$, so that if $\beta$ is the peer density, a typical peer gets $\frac{U_c}{\beta}$ from the servers.

To describe the system, we need another dimensionless parameter in addition to $N_f$. We conveniently choose $\chi_c := \frac{U_c}{\lambda F}$. $\chi_c$ expresses the ratio between the density of rate provided by the servers and the density of rate needed by the system. If $\chi_c > 1$, then the permanent rate from servers is sufficient to serve the peers; otherwise P2P is needed for stability.

Let us consider two limiting cases: the system is mainly client/server ($\chi_c \gg 1$), or the system is mainly P2P with a small server assistance ($\chi_c \ll 1$). The case $\chi_c \ll 1$ can be seen as a scenario where servers are here mainly for ensuring chunk availability.

If $\chi_c \gg 1$, then almost all resources come from the servers. We can deduce that the point process is hard-core (even if $N_f$ is large, if it is fixed and if $\chi_c$ grows, the servers can make newcomers leave before they have the occasion to reach another peer), so if a peer can collect all the available bandwidth in its range, the average latency will be

$$W_c = \frac{F}{\pi R^2 U_c}.$$

For $\chi_c \ll 1$, we focus on the fluid limit. Adapting (8), the rate of a peer is then
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(42) 
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from which we deduce 



$$W_{f,c} = \sqrt{\frac{F - \frac{U_c}{\lambda}}{\lambda 2\pi C R}} = W_f \sqrt{1 - \chi_c} \sim W_f. \tag{44}$$



³ This is distinct from the case where leechers can seed for some time after they complete their download, which is addressed in Section VIII-D.



Let us point out that the behavior of (44) for $\chi_c$ close to 1 is not expected to be realistic, as the impact of the client/server behavior becomes prominent. For the hard-core process, one could also express $W_{h,c}$ as something that tends to $W_h$ if $\chi_c$ tends to 0, which suggests that $M_c(N_f, \chi_c)$ admits a limit $M_c(N_f, 0) = M(N_f)$ when $\chi_c$ tends to 0. In words, the results presented in previous sections still hold if one assumes the existence of servers with relatively small bandwidth introduced to inject chunks into the system.

B. Abandonment 

Here we consider the case where all leechers have some abandonment rate. Let $a$ denote this rate. In the stationary state, we have $\lambda = \left(\frac{\mu_f}{F} + a\right)\beta_f$. From (8), we deduce $\mu_f^2 + \mu_f a F = 2\pi R C \lambda F$. The positive solution of this equation is

$$\mu_f = \sqrt{2\pi R C \lambda F + \left(\frac{aF}{2}\right)^2} - \frac{aF}{2}. \tag{45}$$

The analysis can hence be extended without difficulties. For instance, the abandonment ratio is given by

$$\frac{a}{\frac{\mu_f}{F} + a}.$$
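The abandonment case reduces to a quadratic in the fluid rate, which the following sketch solves and checks. It assumes the stationarity relation $\lambda = (\mu_f/F + a)\beta_f$ together with the TCP fluid relation $\mu_f = 2\pi C R \beta_f$, and takes the abandonment ratio to be the fraction of departures caused by abandonment, $a / (\mu_f/F + a)$. Names and sample values are illustrative.

```python
import math


def rate_with_abandonment(lam, F, C, R, a):
    """Positive root of mu^2 + mu * a * F = 2 pi R C lam F."""
    half = a * F / 2.0
    return math.sqrt(2.0 * math.pi * R * C * lam * F + half ** 2) - half


def abandonment_ratio(mu, F, a):
    """Share of peers leaving by abandonment rather than by completing their download."""
    return a / (mu / F + a)


if __name__ == "__main__":
    lam, F, C, R = 2.0, 3.0, 1.0, 0.1
    for a in (0.0, 0.1, 0.5):
        mu = rate_with_abandonment(lam, F, C, R, a)
        beta = mu / (2.0 * math.pi * C * R)          # fluid density of leechers
        print(a, mu, (mu / F + a) * beta, abandonment_ratio(mu, F, a))
```

The third printed column recovers $\lambda$ in every case, and with `a = 0` the rate falls back to $\sqrt{\lambda F 2\pi C R}$.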

C. Per Peer Rate Limitation 

Due to the asymmetric nature of certain access networks (e.g. ADSL), the uplink rate is often the most important access rate limitation. Let $U$ denote (here) the average upload capacity of a peer; then the average rate in the fluid limit should be such that

$$\mu_f = \sqrt{\lambda F 2\pi C R} \le U. \tag{46}$$

A natural dimensioning rule would then be to choose $R = \frac{U^2}{\lambda F 2\pi C}$ in order to use all the available capacity.

D. Leechers and Seeders 

When a leecher has obtained all its chunks, it can become a seeder and remain such for a duration $T_s$. In this setting, there is a density of seeders $\lambda T_s$ in the stationary regime.

In the fluid limit with seeders, (8) becomes

$$\mu_{f,s} = (\beta_{f,s} + \lambda T_s)\, 2\pi C R. \tag{47}$$

Using (10) and $F = W_{f,s}\, \mu_{f,s}$, we get

$$W_{f,s}^2 + W_{f,s} T_s = W_f^2. \tag{48}$$

The positive solution of this equation is

$$W_{f,s} = \sqrt{W_f^2 + \left(\frac{T_s}{2}\right)^2} - \frac{T_s}{2}. \tag{49}$$

In particular, we have $W_{f,s} \sim W_f$ for $T_s \ll W_f$ and $W_{f,s} \sim \frac{W_f^2}{T_s}$ for $T_s \gg W_f$.

By comparing (49) and (45), one can interpret seeding as the exact opposite of abandonment: seeders, which improve the system, impact the latency in the same way that abandonment, which degrades the system, impacts the rate.
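The effect of the seeding time on the fluid latency can be explored with the short sketch below. It assumes that the latency with seeders is the positive root of $W^2 + W T_s = W_f^2$, and evaluates that root in the algebraically equivalent form $W_f^2 / (\sqrt{W_f^2 + (T_s/2)^2} + T_s/2)$, which avoids cancellation when $T_s$ is much larger than $W_f$. The function name is illustrative.

```python
import math


def latency_with_seeders(w_f, t_s):
    """Positive root of W^2 + W * t_s = w_f^2, written in a cancellation-free form."""
    return w_f ** 2 / (math.sqrt(w_f ** 2 + (t_s / 2.0) ** 2) + t_s / 2.0)


if __name__ == "__main__":
    w_f = 100.0
    for t_s in (0.0, 1.0, 100.0, 1e4):
        w = latency_with_seeders(w_f, t_s)
        # Residual of the quadratic, then the two asymptotic ratios discussed in the text
        print(t_s, w, w * w + w * t_s - w_f ** 2, w / w_f, w * t_s / w_f ** 2)
```

For short seeding times the ratio to $W_f$ stays close to 1, while for long seeding times the latency behaves like $W_f^2 / T_s$.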

We also remark that in a fluid model where rates are only determined by the upload access, we have (see [11] for details)

$$W_{f,s} + T_s = W_f. \tag{50}$$

We can see (48) as the extension of (50) to the network-limited model.

Finally, we propose to study the hard-core limit. Without seeders, a leecher can leave only if it finds a peer within range, and instant service happens with probability one half. With seeders, a leecher is certain to complete its download if there is another peer in its neighborhood, as the latter will not leave the system before the former finishes. We can then notice that the configuration of peers (leechers and seeders) includes a spatial Poisson distribution of density $\lambda T_s$. In particular, the probability for a newcomer to find a peer within range $R$ is at least $1 - e^{-\lambda T_s \pi R^2}$. Therefore, for any $\epsilon > 0$, if $T_s > \frac{-\log(\epsilon)}{\lambda \pi R^2}$, then leechers will get instant service with a probability greater than $1 - \epsilon$.

This suggests that seeders may be a good antidote for systems where a hard-core behavior cannot be avoided: a seeding time of the same order of magnitude as the average latency in the absence of seeders is enough to guarantee that most of the peers get instant download.

E. Adaptive Range 

Consider the constant number of nearest peers model of Section III. In the fluid limit, which can be reached by increasing $L$ until it identifies to $N_f$, an approximate version of this model is obtained by considering a range model with radius $R$ such that $R$, the density $\beta$ and the target number of neighbors $L$ verify

$$\pi R^2 \beta = L. \tag{51}$$



In this case, the rate is as in (7) but with $R = \sqrt{\frac{L}{\pi \beta}}$.

In this section, we consider a general model with $R = \kappa \beta^{-\alpha}$, with $\alpha$ a real parameter. The constant radius ball corresponds to the case $\alpha = 0$ and the $L$ nearest neighbor case to $\alpha = \frac{1}{2}$.

Note that as $\beta$ depends on $R$, $R = \kappa \beta^{-\alpha}$ has to be seen as a fixed point equation for $\alpha > 0$.

By dimensional analysis, one gets that for all $\alpha \ne \frac{1}{2}$, all properties of the system only depend on the parameter

XF 3 



(52) 



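For $\alpha = \frac{1}{2}$ (nearest peers), the parameter is $\rho = \kappa$ (or equivalently $L$).

The fluid analysis gives $\mu_f = 2\pi C \kappa \beta^{1-\alpha}$, so that

$$W_f = \frac{F}{\mu_f} = \lambda^{-\frac{1-\alpha}{2-\alpha}} F^{\frac{1}{2-\alpha}} (2\pi C \kappa)^{-\frac{1}{2-\alpha}}, \qquad \beta_f = \frac{\lambda F}{\mu_f} = (2\pi C \kappa)^{-\frac{1}{2-\alpha}} (\lambda F)^{\frac{1}{2-\alpha}}. \tag{53}$$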



Notice that the algorithm which leads to this hence consists in choosing a radius of the form $R = \kappa \left(\frac{\lambda F}{2\pi C \kappa}\right)^{-\frac{\alpha}{2-\alpha}}$.

For instance, in the constant number of nearest peers TCP case, we get

$$W_{TCP} = \left(\frac{1}{\lambda}\right)^{\frac{1}{3}} \left(\frac{F}{2C\sqrt{\pi L}}\right)^{\frac{2}{3}}. \tag{54}$$

This is an interesting result: it means that in the fluid limit, TCP can achieve super-scalability even if each peer has a limited number of neighbors.

This is not the case for UDP, where the latency is $W_{UDP} = \frac{F}{LC}$ (we still have scalability though).

We conclude this subsection with an asymptotic analysis where all parameters are fixed except for $\lambda$, which tends to infinity. We assume we are in the fluid regime (which will lead to some restrictions on the set of parameters).

In view of (53), we will call $d = \frac{1}{2-\alpha}$ the density exponent, $l = \frac{\alpha-1}{2-\alpha}$ the latency exponent and $r = \alpha/(\alpha-2)$ the radius exponent. We have the conservation rule $d - l = 1$, which is just a rephrasing of Little's law. Similarly $N_f = K \lambda^{\frac{1-2\alpha}{2-\alpha}}$, with $K$ a constant. So, for $\lambda$ tending to $\infty$, the fluid regime requires that either $\alpha > 2$ or $\alpha < \frac{1}{2}$.

Hence, there are 2 regimes when $\lambda \to \infty$:

• For $\alpha > 2$ (which corresponds to $1 < r < \infty$), one gets at the same time $d < 0$ and $l < 0$, which means a peer density and a latency which both tend to 0 when $\lambda$ tends to $\infty$. This is a rather surprising regime: the load per unit time and space tends to infinity; the density tends to 0 (there are no peers around for delivering service); nevertheless, latency tends to 0 (i.e. when a peer arrives, it is instantly served by invisible peers located at infinity). We will call this regime heaven's-flash.

• For $\alpha < \frac{1}{2}$ (which corresponds to $-1/3 < r < 1$), one gets $d > 0$ and $l < 0$, which means a peer density that tends to infinity and a latency which tends to zero when $\lambda$ tends to $\infty$. This is the swarm-flash regime.

Notice the possible existence of a critical-flash regime, with $r = 1$, $\alpha = \infty$, $d = 0$ and $l = -1$, where the density is a constant and the latency tends to 0. Another interesting though critical case is that where $\alpha = 1/2$, where the structural properties of the system do not depend on $\lambda$ anymore, as shown by dimensional analysis.
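The scaling exponents of the adaptive range model can be tabulated exactly with rational arithmetic. The sketch below assumes the TCP fluid scalings $\beta_f \propto \lambda^{d}$, $W_f \propto \lambda^{l}$, $R \propto \lambda^{r}$ and $N_f \propto \lambda^{n}$ with $d = 1/(2-\alpha)$, $l = (\alpha-1)/(2-\alpha)$, $r = \alpha/(\alpha-2)$ and $n = (1-2\alpha)/(2-\alpha)$, obtained by inserting $R = \kappa \beta^{-\alpha}$ into the fluid equations. The function names and the regime labels returned as strings are illustrative.

```python
from fractions import Fraction


def exponents(alpha):
    """Exponents (d, l, r, n) of lambda for density, latency, radius and N_f."""
    alpha = Fraction(alpha)
    if alpha == 2:
        raise ValueError("alpha = 2 is singular for these scalings")
    d = 1 / (2 - alpha)
    l = (alpha - 1) / (2 - alpha)
    r = alpha / (alpha - 2)
    n = (1 - 2 * alpha) / (2 - alpha)
    return d, l, r, n


def regime(alpha):
    """Regime reached when lambda tends to infinity (N_f must grow to stay fluid)."""
    alpha = Fraction(alpha)
    if alpha > 2:
        return "heaven's-flash"
    if alpha < Fraction(1, 2):
        return "swarm-flash"
    return "fluid regime not reached by increasing lambda"


if __name__ == "__main__":
    for alpha in (Fraction(-1), Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(3)):
        d, l, r, n = exponents(alpha)
        print(alpha, d, l, r, n, d - l, regime(alpha))
```

The difference `d - l` is 1 for every admissible $\alpha$, which is the conservation rule stated above, and $\alpha = 1/2$ gives a latency exponent of $-1/3$ with $N_f$ independent of $\lambda$.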

F. Mixed Extensions 

The proposed extensions, presented separately for the sake of clarity, can easily be interleaved, at least in the fluid limit. For instance, combining (33) and (49), the average latency of a system with seeders and a rate function parameter $\gamma$ (cf. Section VII) is

$$W_{f,s} = \sqrt{\frac{F}{\lambda\gamma} + \left(\frac{T_s}{2}\right)^2} - \frac{T_s}{2}. \tag{55}$$

In order to illustrate the fact that the above extensions are compatible, we analyze this case in the setting where the uplink limitation is taken into account.

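$$\mu_f = (\beta_f + \lambda T_s)\, 2\pi \int_0^R \left(\frac{C}{r} \wedge U\right) r\, dr = (\beta_f + \lambda T_s)\, \xi(C, R, U). \tag{56}$$

From Little's law applied to the leechers, $\beta_f = \lambda F / \mu_f$. Hence

$$\beta_f = \frac{\lambda T_s}{2} \left( \sqrt{1 + \frac{4F}{\lambda T_s^2\, \xi(C, R, U)}} - 1 \right), \tag{57}$$

which is an increasing function of $\lambda$. Since $W_f = \beta_f / \lambda$,

$$W_f = \frac{T_s}{2} \left( \sqrt{1 + \frac{4F}{\lambda T_s^2\, \xi(C, R, U)}} - 1 \right). \tag{58}$$

One can then combine this with the various ways of defining $R$ as a function of $\lambda$.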
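The combined model with seeders and a rate-limited rate function can be checked by solving the fluid fixed point directly. The sketch assumes the two relations $\mu_f = (\beta_f + \lambda T_s)\,\xi$ and $\beta_f = \lambda F / \mu_f$, where $\xi$ stands for the integral $2\pi \int_0^R ((C/r) \wedge U)\, r\, dr$ and is passed in as a plain number; the symbol $\xi$ and the function names are illustrative.

```python
import math


def leecher_density(lam, F, t_s, xi):
    """Positive root of beta * (beta + lam * t_s) * xi = lam * F."""
    if t_s == 0:
        return math.sqrt(lam * F / xi)
    x = 4.0 * F / (lam * t_s ** 2 * xi)
    return 0.5 * lam * t_s * (math.sqrt(1.0 + x) - 1.0)


def leecher_latency(lam, F, t_s, xi):
    """Little's law applied to the leechers: W_f = beta_f / lam."""
    return leecher_density(lam, F, t_s, xi) / lam


if __name__ == "__main__":
    F, t_s, xi = 3.0, 1.5, 0.6
    for lam in (2.0, 4.0, 8.0):
        beta = leecher_density(lam, F, t_s, xi)
        residual = beta * (beta + lam * t_s) * xi - lam * F
        print(lam, beta, leecher_latency(lam, F, t_s, xi), residual)
```

In this example the density grows with $\lambda$ while the latency decreases, and the residual of the fixed point equation stays at rounding level.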

IX. Conclusion 

The following general law quantifying P2P super-scalability 
was identified: in a P2P system with rate function g and range 
R, according to our model, the stationary latency is of the 

form 

lir 2 R 4 XF\ [¥ 

-Ws? <59) 



W Q = M 



7 



with 7 = 2ir j g(r)dr and with M(x) a function which is 
larger than 1 and tends to 1 when x tends to infinity (if there 
is no range, ( |59| ) can still be used with the typical range R 
defined in|W]), 

Both in the TCP case, i.e. for $g(r) = C/r$, and in the UDP
case, i.e. for $g(r) = C$, the function $x \mapsto M(x)$ is decreasing
(and has an explicit approximation).

With a decreasing M, Equation (59) exhibits two causes
of super-scalability. First, there is the $1/\sqrt{\lambda}$ super-scalability
that comes from the fluid term $W_f = \sqrt{F/(\lambda\gamma)}$. This is the same
type of super-scalability that was observed in the toy example.
But there is also a super-scalability that comes from M,
which expresses the surprising fact that increasing the arrival
rate reduces the slow-down due to the repulsion phenomenon
identified in the paper. For $N_f$ large enough, the main cause
of scalability is $W_f$, but otherwise, the effect of M on
super-scalability is not to be neglected.

The conditions for the super-scalability formula (59) to
hold were also identified: First, the network should have the 
capacity to cope with the P2P traffic. This translates into the 
requirement 



2XF 

7 



2 g{r)dr 



(60) 



where $\theta$ is the spatial intensity of routers and K the typical link
capacity. In words, the linear capacity of the network should
scale like $\lambda$ if other parameters are unchanged. Secondly, the
access should not be the bottleneck, which translates into the
requirement

$$U > \sqrt{\lambda F \gamma}, \tag{61}$$

where U is the (total) upload capacity of each peer. In words,
the latter should scale like the square root of $\lambda$.
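As a quick numerical reading of the fluid term and of the access requirement, the sketch below evaluates $W_f = \sqrt{F/(\lambda\gamma)}$ and the upload threshold $\sqrt{\lambda F \gamma}$ in the TCP case. It is a minimal illustration only: the function names are hypothetical, $\gamma$ is assumed to be $2\pi\int_0^R r\,g(r)\,dr$ (hence $2\pi C R$ for $g(r) = C/r$), and the correction factor M is deliberately left out.

```python
import math


def gamma_tcp(C, R):
    """gamma for g(r) = C/r on (0, R): 2*pi * integral_0^R r * (C/r) dr."""
    return 2.0 * math.pi * C * R


def fluid_latency(lam, F, gamma):
    """Fluid term W_f = sqrt(F / (lam * gamma)); the factor M is ignored."""
    return math.sqrt(F / (lam * gamma))


def min_upload(lam, F, gamma):
    """Threshold that the per-peer upload capacity U must exceed."""
    return math.sqrt(lam * F * gamma)


def access_is_not_bottleneck(U, lam, F, gamma):
    """True when the access requirement U > sqrt(lam * F * gamma) holds."""
    return U > min_upload(lam, F, gamma)
```

Under these assumptions, multiplying the arrival rate by four halves the fluid latency and doubles the upload capacity each peer needs, which is the square-root scaling stated above.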

We remark that the link capacity requirement is larger than
the access requirement, which intuitively supports our initial
motivation, which was that in future (wired) networks, the
bottleneck should not be the access anymore.

Note that we are fully aware of the fact that, in the hard-core
regime, our model might fail due to the lack of adequate
representation of the chunk level. We expect chunk availability
to become a crucial bottleneck in hard-core. So, if
$N_f := \pi R^2 \beta_f \ll 1$, our conclusions are probably overestimating the
actual performance.

One of the future challenges in the research started by this 
paper is the extension to chunk-level modeling. Considering 
chunks leads to the issue of data availability, and a chunk-based
system may be, in some scenarios, less stable than the
models considered in this paper. For instance, a missing piece 
syndrome may be encountered in the form of growing spatial 
subpopulations missing at least one chunk. Parameters like 
the degree of altruism and the spatial intensity of permanent 
seeders can be expected to appear in the characterization of a 
stable regime. 

X. Appendix: Proof of Proposition 1 (Sketch) 

Choose a number $z_0 > 0$ such that $f(z_0) > 0$ and split
D into cells with diameters at most $z_0$. Then all peers in
a cell with population higher than one receive service at
least at rate $f(z_0)$. It follows that the population of each
cell is stochastically dominated by an M/M/$\infty$ queue that
is modified so that a lone customer cannot leave. Since such
queues are stable with any input rate, the distribution of
$(|\Phi_t| : t \ge 0)$ is tight, whatever the initial state $\Phi_0$. The
ergodicity can now be shown by a standard coupling argument:
two realizations with different initial states but same arrival
process couple in finite time. □

XI. Appendix: Proof of Theorem 1 

We work here on the torus T of area D. Let d denote the
distance on T and m the Haar measure. Let $f : (0,\infty) \to (0,\infty)$
be a positive function, and let $\Phi_t$ be the state of the
SBD at time t. For $x_0 \in T$, let

$$a = \int_T f(\|x - x_0\|)\, m(dx).$$

By translation invariance, a is independent of the choice of
$x_0$. Further, the left hand side of the claim can be expressed
as

$$E\Big[\sum_i f(\|x_i\|)\Big] = E(N_0)\,\frac{a}{D}. \tag{62}$$

Consider now the P2P dynamics on T in steady state. For all
$X \in \Phi_t$, let

$$A_t(X) = \sum_{Y \in \Phi_t,\, Y \neq X} f(\|X - Y\|), \tag{63}$$

$$A_t = \sum_{X \in \Phi_t} A_t(X), \tag{64}$$

where $A_t(X)$ is the death rate of point X and $A_t$ is the total
death rate of the SBD (here we assume that the mean file size
F is equal to 1). The right hand side of the claim can be
written as

$$E_0\Big[\sum_{X_i \in \Phi_0 \setminus \{0\}} f(\|X_i\|)\Big] = E_0(A_0(0)). \tag{65}$$

By the rate conservation principle (e.g., [22], 1.3.3), applied
to the stochastic process $N_t = \Phi_t(T)$, we get

$$\lambda D = E(A_0) = E(N_0)\, E_0(A_0(0)), \tag{66}$$

with $E_0$ the (spatial) Palm probability of $\Phi_0$. This relation
says that the birth rate $= \lambda D$ should balance the death rate
$= E(A_0)$. The relation $E(A_0) = E(N_0) E_0(A_0(0))$ follows
from the definition of the Palm probability.

Let $E^{\uparrow}$ denote the (time) Palm probability of the SBD at
birth epochs and $E^{\downarrow}$ that at death epochs. The rate conservation
principle applied to the stochastic process (total rate) $A_t$, that
we assume càdlàg, gives

$$\tau^{\uparrow} E^{\uparrow}(\mathcal{I}) = \tau^{\downarrow} E^{\downarrow}(\mathcal{D}),$$

with $\tau^{\uparrow}$ and $\tau^{\downarrow}$ the intensities of the birth and death epochs,
$\mathcal{I} = A_{0+} - A_{0-}$ the total rate increase and $\mathcal{D} = A_{0-} - A_{0+}$
the (absolute value of the) total rate decrease. Since $\tau^{\uparrow} = \tau^{\downarrow}$,
we get that

$$E^{\uparrow}(\mathcal{I}) = E^{\downarrow}(\mathcal{D}).$$

From the PASTA property and the fact that births are
uniform on T,

$$E^{\uparrow}(\mathcal{I}) = 2 E(N_0)\,\frac{a}{D}.$$

The (total) death point process admits a stochastic intensity
w.r.t. the filtration $\mathcal{F}_t = \sigma(\Phi_s, s \le t)$ equal to $A_t$. Hence, it
follows from Papangelou's theorem (e.g., [22], Theorem 1.9.2)
that

$$\frac{dP^{\downarrow}}{dP} = \frac{A_0}{E(A_0)}.$$

Since the decrease is of magnitude $2A_{0-}(X)$ (w.r.t. $\Phi_{0-}$)
with probability $A_{0-}(X)/A_{0-}$ (w.r.t. $\Phi_{0-}$), we get

$$E^{\downarrow}(\mathcal{D}) = E\Big[\frac{A_0}{E(A_0)} \sum_{X \in \Phi_0} \frac{A_0(X)}{A_0}\, 2A_0(X)\Big] = 2\,\frac{E\big(\sum_{X \in \Phi_0} (A_0(X))^2\big)}{E\big(\sum_{X \in \Phi_0} A_0(X)\big)}.$$

Hence, when using the fact that

$$\frac{E\big(\sum_{X \in \Phi_0} (A_0(X))^2\big)}{E\big(\sum_{X \in \Phi_0} A_0(X)\big)} = \frac{E_0\big((A_0(0))^2\big)}{E_0(A_0(0))},$$

the rate conservation principle for total rate gives:

$$E(N_0)\,\frac{a}{D} = \frac{E_0\big((A_0(0))^2\big)}{E_0(A_0(0))}. \tag{67}$$

Recalling (62) and (65), we now note that Theorem 1
follows from the fact that $E_0\big((A_0(0))^2\big) \ge \big(E_0(A_0(0))\big)^2$.
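The proof relies on the observation that removing a point X changes the total death rate by twice the death rate of X, since X disappears both as a receiver and as a contributor to the rates of the other points. The Python sketch below checks this bookkeeping identity on a random configuration. It is an illustration only: the helper names, the square torus of side `side`, and the bounded rate function $1/(1+r)$ are assumptions made for the example, not choices taken from the paper.

```python
import math
import random


def torus_dist(p, q, side):
    """Distance between p and q on a square torus of the given side."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    return math.hypot(min(dx, side - dx), min(dy, side - dy))


def bounded_rate(r):
    """Hypothetical bounded positive rate function, used only as an example."""
    return 1.0 / (1.0 + r)


def point_rate(i, pts, f, side):
    """Death rate of point i: sum of f over distances to all other points."""
    return sum(f(torus_dist(pts[i], pts[j], side))
               for j in range(len(pts)) if j != i)


def total_rate(pts, f, side):
    """Total death rate: sum of the death rates of all points."""
    return sum(point_rate(i, pts, f, side) for i in range(len(pts)))


def drop_on_removal(i, pts, f, side):
    """Decrease of the total rate when point i is removed."""
    remaining = pts[:i] + pts[i + 1:]
    return total_rate(pts, f, side) - total_rate(remaining, f, side)


def random_configuration(n, side, seed=0):
    """n points drawn uniformly on the torus, with a fixed seed."""
    rng = random.Random(seed)
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
```

For any configuration `pts` and index `i`, `drop_on_removal(i, pts, bounded_rate, side)` should agree with `2 * point_rate(i, pts, bounded_rate, side)` up to rounding; the same symmetry explains the factor 2 in the mean increase at birth epochs.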



XII. Appendix: Sketch of Proof of Theorem 2 



XIII. Appendix: Justification of the Heuristic 



Assume for simplicity that f is bounded. We proceed as
in the fluid limit of a queue, by scaling the arrival and
service rates appropriately, and consider a sequence of systems
indexed by n, where n is a parameter that tends to infinity.
Our assumption is that the arrival rate in system n is $\lambda_n = \lambda n$,
and the mean file size in system n is $F_n = F n$.

We tessellate the plane with a grid made of squares of side
$\delta$, and time with a grid of width $\eta$. Hence, the mean number
of arrivals in a typical square and a typical time interval is
$\lambda n \delta^2 \eta$ for all n. In addition, the strong law of large numbers
(SLLN) shows that the random number of arrivals $A^t_n$ in a
typical square in the time interval $(t, t + \eta)$ is such that $A^t_n/n$
tends a.s. to the constant $\lambda \eta \delta^2$ when n tends to infinity.

The next task is to show that the number of peers $N^t_n(k, l)$
present at time t in the square with coordinates $(k\delta, l\delta)$ is
such that $N^t_n(k, l)/n$ converges a.s. to some deterministic limit
$\beta^t(k, l)\delta^2$. We then get that the number of deaths in this square
in the time interval $[t, t + \eta)$, denoted by $D^t_n(k, l)$, satisfies

$$\lim_{n \to \infty} \frac{1}{n} D^t_n(k, l) = \frac{1}{F}\, \beta^t(k, l) \sum_{p,q} f[(p\delta, q\delta), (k\delta, l\delta)]\, \beta^t(p, q)\, \delta^4 \eta.$$

This follows from the fact that a typical peer in the square
$(k\delta, l\delta)$ dies approximately with probability

$$\frac{\eta}{nF} \sum_{p,q} f[(p\delta, q\delta), (k\delta, l\delta)]\, N^t_n(p, q),$$

that tends to

$$\frac{\eta}{F} \sum_{p,q} f[(p\delta, q\delta), (k\delta, l\delta)]\, \beta^t(p, q)\, \delta^2,$$

so that the number of deaths tends to the announced limit.
(Notice however that this discretization does not make sense
for, e.g., $f(x, y) = C/|x - y|$, as $f[(k\delta, l\delta), (k\delta, l\delta)] = \infty$.)

Hence, by letting $\delta$ and $\eta$ tend to 0, we get that the function
$\beta^t(x)$, which is the value of the density at $x \in \mathbb{R}^2$ at time t in
the fluid regime, satisfies the differential equation

$$\frac{d}{dt} \beta^t(x) = \lambda - \frac{\beta^t(x)}{F} \int_{\mathbb{R}^2} f(x, y)\, \beta^t(y)\, dy. \tag{68}$$

The steady state of this is

$$\lambda = \frac{\beta(x)}{F} \int_{\mathbb{R}^2} f(x, y)\, \beta(y)\, dy.$$

A translation invariant solution of this is

$$\beta^2 = \frac{\lambda F}{\int_{\mathbb{R}^2} f(x, y)\, dy},$$

which is the "fluid solution".
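For a spatially constant density, the fluid dynamics reduce to the scalar equation $d\beta/dt = \lambda - \beta^2 \gamma / F$, where $\gamma$ stands for the integral of f over the plane. The following Python sketch integrates this equation with an explicit Euler scheme and can be used to check convergence to the fluid solution $\sqrt{\lambda F/\gamma}$. It is only a sketch: the function name, the step size and the horizon are arbitrary assumptions, and the Euler scheme is a convenience, not the authors' method.

```python
import math


def fluid_density_path(lam, F, gamma, beta0=0.0, dt=1e-3, t_end=10.0):
    """Explicit Euler integration of d(beta)/dt = lam - beta**2 * gamma / F.

    Translation-invariant special case of the fluid equation, with gamma
    standing for the integral of the rate function over the plane.
    """
    beta = beta0
    path = [beta]
    for _ in range(int(round(t_end / dt))):
        beta += dt * (lam - beta * beta * gamma / F)
        path.append(beta)
    return path


def fluid_solution(lam, F, gamma):
    """Stationary translation-invariant density sqrt(lam * F / gamma)."""
    return math.sqrt(lam * F / gamma)
```

Starting from an empty system (`beta0=0`), the path increases towards `fluid_solution(lam, F, gamma)`, provided `dt` is small compared with $F/(2\beta\gamma)$.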



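In order to derive the heuristic of Section IV-E, we use
the balance equation for the second order factorial moment
density, which reads

$$2\beta_o \lambda = 2 m_{[2]}(x, y)\, \frac{C}{F}\, \frac{1_{\|x-y\|<R}}{\|x-y\|} + \frac{C}{F} \int_D m_{[3]}(x, y, z) \left( \frac{1_{\|x-z\|<R}}{\|x-z\|} + \frac{1_{\|y-z\|<R}}{\|y-z\|} \right) dz, \tag{69}$$

for all x and y. We then use the following approximations:

$$m_{[3]}(x, y, z) \approx \frac{m_{[2]}(x, y)\, m_{[2]}(x, z)}{\beta_o}, \qquad m_{[3]}(x, y, z) \approx \frac{m_{[2]}(x, y)\, m_{[2]}(y, z)}{\beta_o}.$$

Then, we get from (69) that

$$\beta_o \lambda \approx m_{[2]}(x, y)\, \frac{C}{F}\, \frac{1_{\|x-y\|<R}}{\|x-y\|} + m_{[2]}(x, y)\, \frac{C}{2F} \int_D \frac{1_{\|x-z\|<R}}{\|x-z\|}\, \frac{m_{[2]}(x, z)}{\beta_o}\, dz + m_{[2]}(x, y)\, \frac{C}{2F} \int_D \frac{1_{\|y-z\|<R}}{\|y-z\|}\, \frac{m_{[2]}(y, z)}{\beta_o}\, dz,$$

that is

$$m_{[2]}(x, y) \approx \frac{\lambda F \beta_o}{C\, \dfrac{1_{\|x-y\|<R}}{\|x-y\|} + \mu_o}, \tag{70}$$

with $\mu_o := C \int_{B(0,R)} \frac{m_{[2]}(0, z)}{\beta_o \|z\|}\, dz$. So

$$\mu_o \approx \lambda F\, 2\pi C \int_0^R \frac{r}{C + \mu_o r}\, dr = \frac{2\pi \lambda F C}{\mu_o} \left( R - \frac{C}{\mu_o} \ln\!\left(1 + \frac{\mu_o R}{C}\right) \right),$$

which is our departure point in Section IV-E.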
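The last relation is a fixed-point equation in $\mu_o$, namely $\mu_o = (2\pi\lambda F C/\mu_o)\,(R - (C/\mu_o)\ln(1 + \mu_o R/C))$, and it can be solved numerically. The Python sketch below uses plain bisection between a small positive value and the fluid rate $\mu_f = \sqrt{2\pi\lambda F C R}$. This is a minimal illustration under assumptions: the function names, the bracketing interval and the tolerance are choices made for the example, and bisection is just one convenient root finder.

```python
import math


def heuristic_residual(mu, lam, F, C, R):
    """mu minus the right-hand side of the fixed-point equation for mu_o."""
    rhs = (2.0 * math.pi * lam * F * C / mu) * (
        R - (C / mu) * math.log1p(mu * R / C))
    return mu - rhs


def solve_mu_o(lam, F, C, R, tol=1e-12):
    """Bisection on (1e-6 * mu_f, mu_f), where mu_f is the fluid rate."""
    mu_f = math.sqrt(2.0 * math.pi * lam * F * C * R)
    lo, hi = 1e-6 * mu_f, mu_f
    if heuristic_residual(lo, lam, F, C, R) >= 0.0:
        raise ValueError("root not bracketed from below")
    if heuristic_residual(hi, lam, F, C, R) <= 0.0:
        raise ValueError("root not bracketed from above")
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if heuristic_residual(mid, lam, F, C, R) < 0.0:
            lo = mid   # root lies above mid
        else:
            hi = mid   # root lies at or below mid
    return 0.5 * (lo + hi)
```

Since the right-hand side is decreasing in $\mu_o$ and lies below $\mu_f$ at $\mu_o = \mu_f$, the root found this way is smaller than the fluid rate, which is consistent with the slow-down attributed to repulsion.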
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